MAMP and virtual local hosts / IPv6

Recently I noticed that my local access to virtual hosts created with MAMP ( http://mamp.info/ ) is pretty slow. It takes a couple of seconds before the request is actually sent.
MAMP (Pro) adds virtual hosts to the /etc/hosts file on the Mac with 127.0.0.1 as their IP address, so that I can just enter http://myapplicationname:8888 to get to the local development environment of my app.
As I got similar timeouts accessing my hosts when I enabled IPv6 for them, I guessed there was a connection.
Adding the IPv6 equivalent of 127.0.0.1 ( ::1 ) finally did the trick.
Add
::1 myapplicationname
to /etc/hosts and it works blazing fast again.
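For completeness, the relevant /etc/hosts lines then look something like the sketch below (the hostname is of course whatever MAMP put in there for your virtual host); on newer OS X versions flushing the resolver cache afterwards shouldn't hurt:

    127.0.0.1   myapplicationname
    ::1         myapplicationname

    # flush the resolver cache on Leopard / Snow Leopard
    dscacheutil -flushcache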

Seems like OS X nowadays always does an IPv6 lookup, even for local hosts.

Phorum has moved to GitHub

Long time, no post🙂.

Effective today Phorum has moved to GitHub (https://github.com/Phorum/Core).
That means our code repository was converted to git (I had to use svn2git, as GitHub's internal import didn't pick up our tags and branches) and the trac tickets were imported into GitHub issues. For the tickets I wrote my own PHP script, because neither of the two existing scripts worked for me: they barfed on some broken charset characters from trac or didn't take GitHub's API request limit (60 requests per minute) into account. My own script took something like 3 hours to import our 900 tickets, but it was a breeze to implement with the well documented GitHub API – and its PHP library ;-).
GitHub issues don't support attachments yet, which means that only the comments are ported over – and a link back to the trac install is provided.
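For anyone attempting a similar conversion: the svn2git run itself was the easy part. A rough sketch (the repository URL and authors file below are placeholders, not our actual setup):

    gem install svn2git
    # map svn usernames to "Name <email>" pairs in authors.txt first
    svn2git http://svn.example.org/phorum --authors ~/authors.txt --verbose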

One missing part is the wiki import but I’ll see what I can use of the old data.
Our release scripts haven't been changed yet either, but I'll have to see whether they can simply stuff the generated docs into the tagged release, since GitHub already creates tarballs / zip archives from the existing tags. Then we'd just need to download them and put them on phorum.org too😉.

It took me a while to get my local development environment to work with git, but a first commit to our 5.2 branch is done now, so that actually works. Working with branches in git made my head hurt, but as long as it works I don't really mind. I'm currently evaluating phpStorm for my development work, which has built-in GitHub and git support and seems to handle that in an accessible way without me having to look into the internals of git too much.

HP SmartArray controllers penalizing “foreign” harddisks?

Another story from the hardware perspective.
As a backup server I'm running an old-ish DL380 G3. It doesn't have too much power but can store some backups on 6 x 146GB SCSI drives in a RAID 5 configuration.
Some weeks ago one of the disks in the array died and needed replacement. As the machine is far out of warranty I checked the prices for hard disks and went with a Seagate drive in an HP tray; it has similar specs, and HP seems to use Seagate for at least some of its own drive offerings.

Guess what happened to performance?
Rebuilding the array took like 20 hours. I wouldn't have thought it would be that bad, but well, as long as it works … .
Afterwards I was shocked by backups taking more than half a day, where they usually take 1-2 hours at night. I/O wait was around 80% during that time.
Hence I checked the I/O performance with iozone and got around 3 MiB/s write speed. Doh!
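For reference, the iozone run was roughly along these lines (file size and record size picked to get past the controller cache; the path is just an example):

    iozone -e -i 0 -i 1 -r 64k -s 2g -f /mnt/backup/iozone.tmp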
I checked the results from the controller firmware … all disks are fine, no error, nothing, just that the last disk doesn't have HP/Compaq firmware.

That was with the internal HP SmartArray 5i with battery-backed write cache. That controller only supports the older 160 MB/s SCSI standard. Well, old controllers are cheap, so let's try a SmartArray 641 with BBWC which supports 320 MB/s.
Write speed went up to 5 MiB/s, great!😉 Not really, but better. Then I got another original HP hard disk and tried again … write speeds went up to 22-41 MiB/s (without BBWC, because the battery had died)!

So, I've read and heard that mixing drives from different manufacturers in a RAID array is bad. But that bad? I mean, the drives have similar specs (SCSI U320, 10k rpm, similar access times and all), what can go wrong with that? Still, that could be the issue.
On the other hand, as the other HP hard disk made the difference, I assume the controllers are penalizing foreign hard disks. Either because those drives haven't gone through HP's own QA, or to fuel the replacement-parts business.

Go figure!

PS: one good thing about the SmartArray controllers I found in this process: they are really data compatible. The better SmartArray 641 recognized the array without problems, and after turning off the SmartArray 5i the system was suddenly back to life, with no need to change or configure anything – besides the performance.

IPv6 experiments / lessons learned

During the last couple of days I did some experiments with IPv6 connectivity / applications / configuration.
For nearly two years I have already had two sixxs.net tunnels, one for a server and one for my home connectivity.
I never got aiccu working on Mac OS X, so the home tunnel was down most of the time.

Finally it got to me and I worked on setting up two subnets again, one for the home network and one for the servers.
For the Gentoo servers I used the router howto from http://www.gentoo.de/doc/de/ipv6.xml with the radvd configuration.
radvd is a router advertisement daemon for IPv6 networks. IPv6 has an autoconfiguration mechanism where the router advertisement daemon announces the supported prefix (the equivalent of network/netmask in the IPv4 world) and its own address as the gateway (a minimal example config follows below). Most IPv6 stacks seem to have this autoconfiguration enabled by default, so every IPv6-enabled server on the reachable network suddenly has an IPv6 address. I never knew that so many of my servers were IPv6-enabled, and even quite a few servers of my ISP were suddenly connected through IPv6 (which got me a curious call from my ISP ;-)).
That's the first thing to worry about: suddenly they are all connected to the big bad internet without correct reverse DNS entries, firewalls and the like.
Speaking of firewalls: usually you don't have an IPv6 firewall up at that moment, and your old IPv4 firewall rules won't catch any IPv6 traffic. So, again, every IPv6-enabled host is exposed to the world without proper protection. That's even worse if you open a tunnel to your home network, because the home network usually sits behind a router doing NAT and only uses private IP addresses internally, so the hosts are not exposed to the outside world at all. By opening the tunnel and enabling the radvd service you have put them out into the open world as well.
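For reference, the radvd configuration from the howto boils down to something like this sketch (interface name and prefix are placeholders, 2001:db8::/32 being the documentation prefix):

    # /etc/radvd.conf
    interface eth0
    {
        AdvSendAdvert on;
        prefix 2001:db8:1234:5678::/64
        {
            AdvOnLink on;
            AdvAutonomous on;
        };
    };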

On my home network I have a CentOS 5 server running which provides some SMB services and the like.
I connected that one to the sixxs tunnel and started the radvd service on that box. So far so good: Mac OS X has IPv6 with autoconfiguration enabled by default, so the hosts got their IPv6 addresses and routing.
ping6 worked (by the way, it's nice to have most tools available as IPv6 commands with just a 6 at the end), but the browser delivered no IPv6 website. There you are: CentOS 5 / RHEL HAVE an ip6tables ruleset enabled by default, and it was only open for ICMP (ping) messages. Good protection, but it cost me a while to diagnose. So I opened some more loopholes for the IPv6 connection on the home network for smtp, imap, http, https and dns, and left the radvd daemon running.
On the server network I disabled the radvd service and set IPv6 addresses and the gateway manually, so that I don't disturb the neighbours on the network anymore🙂. A strict ip6tables ruleset was enabled there too.
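A minimal sketch of such an ip6tables ruleset (the ports are just examples for the services mentioned above; the stateful rule needs a kernel with IPv6 connection tracking, which older kernels don't provide):

    ip6tables -P INPUT DROP
    ip6tables -A INPUT -i lo -j ACCEPT
    # ICMPv6 is essential for IPv6 (neighbour discovery, path MTU discovery)
    ip6tables -A INPUT -p ipv6-icmp -j ACCEPT
    ip6tables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    ip6tables -A INPUT -p tcp --dport 22 -j ACCEPT
    ip6tables -A INPUT -p tcp --dport 80 -j ACCEPT
    ip6tables -A INPUT -p tcp --dport 443 -j ACCEPT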
For fun I went through the IPv6 certification by HE.net and got as far as proving that I have:

  • IPv6 connectivity
  • an IPv6-enabled webserver
  • an IPv6-enabled mail address (yes, my main mysnip.de mail address is now IPv6-enabled!)
  • reverse DNS entries for my IPv6-enabled hosts (PowerDNS has no problems with that)

The step which still gives me trouble is that I can't present fully IPv6-enabled nameservers to the outside world. My main nameserver is IPv6-enabled, but the secondary ones from inwx.de don't have IPv6 connectivity or AAAA entries, so there's not much I can do about it.
Skimming through the mail logs on my mailserver I was stunned to see that *a lot* of spam is already trying to be delivered over IPv6. postgrey works with IPv6 without trouble, amavis / SpamAssassin too, so there's not really a problem. Seems like spammers adapt quickly to new technologies though. On the other hand I found that freenet.de (a German ISP) has its mailservers connected through IPv6 already and is publishing AAAA entries for them, so some mail is already delivered over IPv6.
In the near future I might try to offer some experimental IPv6 access to the services provided, but without any native IPv6 connectivity (does anyone know if TeliaSonera offers it and whether it costs extra?) that doesn't make too much sense for production.

At least now I can check how the applications I’m using and providing are working with IPv6. Also Phorum needs to be checked for that.

MySQL in Gentoo …

Merely as a note to myself I just looked around the Gentoo bugtracker to learn about the current state of MySQL in Gentoo.
So far I found two related bugs:
About 5.0.x
http://bugs.gentoo.org/show_bug.cgi?id=279493

What I learned from this bug:
Recent dev-db/mysql versions already contain most of the Percona patchsets (neat!). Dunno about XtraDB so far.
The latest version in the tree is 5.0.84, which I'm trying on a backup system now (even though it is marked ~x86/~amd64, aka testing).
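In case someone wants to follow along: on Gentoo the testing ebuild can be keyword-unmasked for just that version, roughly like this (shown for amd64; use ~x86 on x86 boxes):

    echo "=dev-db/mysql-5.0.84 ~amd64" >> /etc/portage/package.keywords
    emerge --ask --verbose dev-db/mysql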

About 5.1/5.4
http://bugs.gentoo.org/show_bug.cgi?id=194561
5.1 is going to be put into dev-db/mysql too (not into dev-db/mysql-community as before, because of the changed development model).
Quoting Robin Johnson:

“I intend to issue a package move after we’ve had a few versions >=5.0.83 in stable, but there is no further need to make new -community ebuilds.”

So there will be 5.1.x in the tree once some more recent 5.0.x versions have been marked stable. The latest stable MySQL version in the tree is 5.0.70.

Seems like he's also holding back because of some more breakage in earlier 5.1 versions.
As another quote:

“I’m aware that 5.1.30 is out. However it’s still in bad bad shape. […] It certainly ate some of my data when I tested it.”

On the topic of what's keeping them from stabilizing later MySQL (> 5.0.70) versions, I found a quote from Robin Johnson too (who seems to be THE MySQL maintainer in Gentoo):

My most defining test for putting MySQL builds in the tree has been that it
passes both of the following:
1. Passes it’s own testcases (upstream has been atrociously bad at this, see
status2 in 5.0.72 for example)
2. It doesn’t eat my data or break my systems.

#1 is pretty easy as a start point, seeing if it works.
#2 is a lot tougher:
– 5.0.70 is the best option for now.
– 5.0.72 breaks most of the statistics code out there really badly (I filed
upstream bug 41131) – changes to SHOW behaviour as well as the ‘Questions’
variable.
[…]
Having upstream do sane releases is part of why there has been so long between
my 5.0.x bumps, because they haven’t passed my personal testing.

So far it's mostly general stuff and I don't know whether MySQL 5.0.x still doesn't pass its own testcases or anything. I couldn't find more detailed information in the bugtracker.

MySQL proliferation

Long time no post, but this is some stuff that has been lingering in my head for a while when looking at the MySQL ecosystem lately.

For a long time I had to stay with MySQL 4.0 (sick, I know) because there's a lot of software to adapt to the new version(s), but lately I'm pushing it more and more towards MySQL 5.0.
MySQL 5.1 would also be interesting, and even MySQL 5.4 … but that's where the trouble starts.
MySQL 5.4 came out of the dark, no one expected it, and it brought numerous improvements … though it's still in beta.
With that release (or at least around that time) MySQL started to change its release model to something new where version numbers matter far less and where there should be regular releases. Well, Oracle has bought Sun, which owns MySQL … so we will see what the next "release model" will be.

On the other hand there are some "forks" of MySQL out there which offer further improvements, or at least are supposed to.
For one there is XtraDB, which is now supposed to be just a replacement for the InnoDB plugin … while it had more patches against the main MySQL server before, as far as I remember – so it's not really a fork, just another storage engine.
Edit: I just found a newer release – it's called "MySQL with Percona patches".

Then there is MariaDB, which should be a "… community developed branch of the MySQL database that uses the Maria engine by default" (quote from the linked page) and is developed by Monty Program AB and the OpenDatabaseAlliance.

OurDelta, finally, is "only" a collection of patches and of MySQL builds with those patches applied.
I’m pretty sure that there are more forks or patch collections out there, please forgive me if yours isn’t listed.

But here is the question: which MySQL version / patchset / fork should one use?
Previously it was just a question of using the commercial enterprise or the community version. Now I'm just confused.

The next problem is the distribution on Linux …
Nearly all my servers use Gentoo as their Linux distribution, but have you ever looked at the status of MySQL in there?
The latest "stable" version is dev-db/mysql-5.0.70. dev-db/mysql-community-5.1.21_beta is in there, but not marked as stable (5.1.39 being the latest one on the MySQL homepage); 5.4.x is nowhere to be seen, same for XtraDB, MariaDB … . I don't blame the maintainers – who could keep up with that flow of releases, different branches and/or forks? I also heard rumours that all the latest versions are failing numerous tests for the maintainer(s) and therefore won't go in.
If I want to use a recent stable version or one of the enhancements I will have to do my own builds instead of using the great Gentoo infrastructure for that. I could, sure, but time is short and I would get much further if I could use the regular way to install (and update) software on Gentoo. I don't have a full-blown infrastructure team to keep the systems going.

If I didn't have so much software using raw database calls in PHP or Perl I would seriously consider switching to some other database, PostgreSQL or the like. I've even heard Maurice considering switching to PostgreSQL in the future and supporting the PostgreSQL layer for Phorum.

If no one stops this proliferation of the MySQL ecosystem and provides reliable, regularly updated stable builds of one tree, I think Oracle won't have to worry about MySQL anymore. There won't be much left of its user base and community.
But maybe it's just me, painting things black, who knows😉.

Dell 2650/PERC 3/Di with kernel > 2.6.22 and XFS

As it took me a day to find out I wanted to post my findings here too.

I got a used Dell PowerEdge 2650 and (as usual) installed Gentoo on it. First I had a faulty hard disk in the RAID 5, and rebuilding took like 6 hours.
So I didn't mind the slow I/O performance, which I attributed to the rebuild in progress.
Unfortunately it still didn't get better when the rebuild was finished. A simple "ls" took seconds, installing gentoo-sources took more than an hour, and so on. I applied all firmware and BIOS updates until none were available anymore. Still, no dice.
Searching around the web I stumbled upon this post and this post (from the same author), which point to issues with the most recent aacraid driver but make no connection to XFS yet.
Nearly convinced to downgrade the kernel or at least the aacraid driver, I did a search in the Gentoo forums and finally found the solution.
Mounting the XFS filesystems with nobarrier brought the speed back to normal. Personally I would never have thought of that solution, but it seems the newer aacraid driver doesn't report back that write barriers are a bad idea on the PERC 3/Di.
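For the record, the fix is just a mount option; a sketch of the fstab entry (device and mount point are examples, not my actual layout):

    # /etc/fstab
    /dev/sda3   /data   xfs   noatime,nobarrier   0 0

Then unmount and remount the filesystem (or reboot) for the option to take effect.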

Now I'm up for the task of trying to get OpenManage running on Gentoo … let's see if an exotic approach helps.

Nginx, finally!

Seeing the notice that the license for my LiteSpeed webserver is expiring again (yearly payments😦 ), I finally started to move my sites to nginx (together with a move between datacenters, so webserver configuration had to be done anyway).
There were some more webservers in the running, but I ended up with nginx.
The others were lighttpd (it got a bit silent over there and I don't want to put my sites on a dying project), Cherokee (now even with a web interface!, but the documentation is a bit sparse and the latest release seems inconsistent with the configuration – I simply couldn't find out how to do what I wanted to do) and the original LiteSpeed webserver.
In the end I wanted to come back to an open source webserver which doesn’t lock me in like that.
LSWS had some regressions in the last versions, and one always has to wait for the developer team to fix them (even though they are quick), as no one else can dig into the code; nobody can write modules or enhancements either, because of the closed source.
There were also some features which are now only available in the enterprise (aka paid) version, which I don't want to be forced to use forever. In the last year(s) it has also simply been directed more at hosting companies and similar users, who use native httpd.conf files rather than doing the configuration in the web interface LiteSpeed offers. Some features even only work via httpd.conf entries.
Oh, and the free version doesn't offer x86-64 builds, so I needed compat libs.
So better to make the cut now and use something else.
Nginx has the FastCGI load balancing I want, rewrite rules, great configurability and a very active community (and developers).
The only thing I'm really missing is the possibility to use .htaccess files, which forced me to hunt down the .htaccess files and turn their rules into native nginx configuration entries. Oh, one feature I forgot: reloading the configuration without doing a full restart of the webserver is neat too🙂.
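To give an idea of what the converted configuration looks like, here is a stripped-down sketch (server name, paths, backend ports and the rewrite rule are made-up examples, not my actual config):

    upstream phpbackend {
        server 127.0.0.1:9000;
        server 127.0.0.1:9001;
    }

    server {
        listen       80;
        server_name  example.org;
        root         /var/www/example.org;

        # a rule that used to live in an .htaccess file
        rewrite ^/old-page$ /new-page permanent;

        location ~ \.php$ {
            include       fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass  phpbackend;
        }
    }

Reloading without a full restart is then just a signal to the master process, e.g. kill -HUP on the pid from the nginx pid file (or nginx -s reload on newer versions).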
All issues I had could be solved quickly by either searching the mailing list archive or posting there.

Don't get me wrong: I still recommend LSWS to users who want an easy-to-use webserver with great performance as a drop-in replacement for Apache, supporting most of the previous features out of the box, but it's simply not for me anymore.

The next steps for Phorum

Now that Phorum 5.2 has finally gone stable there will hopefully be some better modules, as the possibilities have been vastly increased. One of the new modules for 5.2 which shows quite a few of these abilities is the rewritten user-avatar module.
With modules you can now (not everything is new in 5.2!!!):

  • use a supported API for files, users, custom profile fields and similar stuff
  • hook the module CSS into the CSS loaded by Phorum, for valid (x)html pages and without loading it separately (saving requests)
  • hook the module JavaScript into the JavaScript loaded by Phorum, for … see above😉
    (both can use raw files, templates or functions for including it)
  • do database calls without writing database-dependent code (the queries themselves could still be)
  • use module templates which are included in the module itself, no need to copy them to the template folder(s)
  • ship language files in the modules themselves
  • add control center panels without copying files around

Also our module list for 5.2 is now auto-generated from the modules posted into the 5.2-modules forum in the right format.
Make sure to add categories too as listed in the docs!

So, now that we (could) have better modules, what's next?

Dan Langille has been working on a PostgreSQL layer for 5.2/5.3, which will probably be included in one of the later 5.2 releases as a beta.
The next big release will be Phorum 5.3. Our plan for Phorum 5.3 is "just" to add even more APIs, changing large parts of the backend without touching much of the frontend code.
Therefore templates from 5.2 should work with 5.3 without a hitch. Newly added features may be missing from an old template, but otherwise it will continue to work as before.
I know we made it hard for some admins with the switches from 5.0 to 5.1 and from 5.1 to 5.2, but all these changes were done for flexibility in the templates and to make them far more consistent and therefore easier to implement.
Some of the APIs will be about forum handling and similar stuff, so that you can build a new admin, or an admin embedded in another page, far more easily than before.
As usual you can see the tickets on the table for 5.3 in our ticket-list (from the 5.3-milestone).

And further in the future?
There is lots and lots of stuff in the ever growing Ideas-milestone.
We’ll see if any of this will see the light in 5.3 already or in a later release but we surely won’t get bored😉.
I'm pretty sure that lots of stuff will get done at the MySQL Conference 2008, like last year, where we were coding and presenting and closed lots of feature tickets for 5.2. You can help us get there by donating to phorum.org!

Laws and the use of logging IPs

In the light of recent court decisions in Germany ( german article ) which essentially disallow logging of IPs, I'm wondering what one would really need it for.

I’m using IP-logging/-tracking in multiple ways:
1. statistics about visits and recurring users
2. storing it with forum-posts to allow law enforcement in case some user really goes over the line
3. tracking requests in a given time by IP to automatically block potential attacks

So what of that could be avoided?

For 1., one could simply skip logging the IP, but counting visits and recurring users would then be impossible. What now? Maybe log an md5/shaX hash of the IP to have some unique key per IP? Wouldn't that still fall under the court's ruling, since you could find out what the actual IP was?
Counting visits is an important tool for getting advertisers to advertise on a page (in my opinion). Any ideas?
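To illustrate the concern with a plain hash: there are only about four billion IPv4 addresses, so anyone can hash them all and look up the original. A keyed hash with a secret that is rotated and thrown away regularly is at least harder to reverse (address and secret below are obviously just examples):

    # plain hash: reversible by simply hashing all ~4 billion IPv4 addresses
    echo -n "192.0.2.17" | sha1sum

    # keyed hash (HMAC) with a secret that gets rotated and discarded
    echo -n "192.0.2.17" | openssl dgst -sha1 -hmac "rotating-secret"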

For 2., I guess one could disable that, but would I then be responsible for each and every forum post because the real poster can't be identified? (Yeah, the laws in Germany are bad for the one offering the forum after all😦 )
On the other hand there is the upcoming data retention law ( german news collection about this topic ), which is planned to require keeping all records for 6 months (!!!). So for now I should remove all tracking of IP addresses, just to be forced to store them for 6 months a while later?

For 3., this behaviour gives me another problem too. Load balancing over multiple webservers usually goes through a reverse proxy in front of them, which always presents the reverse proxy's address as REMOTE_ADDR to the apps. So the reverse proxy would need to provide this security layer, but so far I have failed to find one that does.
But is that really needed, or am I just oversensitive in this area? Do I have to accept any number of requests per second from any user?
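As a sketch of how such a layer could look at the proxy level: nginx, for example, offers the limit_req module for per-IP rate limiting (zone name, rate and backend address below are made-up values):

    http {
        # track clients by address, allow 5 requests per second on average
        limit_req_zone $binary_remote_addr zone=perip:10m rate=5r/s;

        upstream backend {
            server 10.0.0.2:80;
        }

        server {
            listen 80;

            location / {
                limit_req zone=perip burst=20;
                # pass the original client address on to the backends
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_pass http://backend;
            }
        }
    }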

Are there other use-cases for logging IPs?

How are other users handling this?