Disk space issues due to running out of inodes

We have been seeing an issue on our private instance due to too many “result” data files on the WebPageTest server. We fired off a big load on our private instance, and after a while the server started complaining about disk space. On investigating, we seem to have run out of inodes (rather than actual disk space), and we suspect it is because of the large number of “result” data files created for each measurement. For what it’s worth, we are using ext4.
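In case it helps anyone else hitting this, this is how we confirmed it was inodes rather than actual space (the path is just where our results happen to live; adjust for your install):

df -h /var/www/webpagetest/www/results    # block usage looks fine
df -i /var/www/webpagetest/www/results    # IUse% pegged at 100%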

Is there a way to tune the file system to deal with this, other than purging/archiving old data? Some file system tuning option, maybe?

I’m facing the same issue with a private instance running from the AWS AMI. Haven’t found a solution yet…

Ooh … Linux - my favorite subject…

The best option - create a ZFS partition for your result data files.
See http://zfsonlinux.org/

I would personally keep everything else on the ext4 partition.
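A minimal sketch of what that could look like (the device name and mount point below are just examples; adjust for your setup):

zpool create wptresults /dev/xvdf
zfs create -o mountpoint=/var/www/webpagetest/www/results wptresults/results
zfs set compression=lz4 wptresults/results    # optional; the result files are mostly text and compress well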

If you want to stay on an ext4 partition, create it with a high inode count (fewer bytes per inode), such as the ‘news’ usage type.
Create a new partition with something like:

mkfs.ext4 -T news /dev/yourpartition
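(If you'd rather size it explicitly, -i sets the bytes-per-inode ratio, e.g. mkfs.ext4 -i 4096 /dev/yourpartition for one inode per 4 KB, and tune2fs -l /dev/yourpartition will show the inode count you ended up with.)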

A better solution:
Use ZFS - it allocates inodes on the fly.

http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg43751.html

Thanks. This was an issue with the ‘out-of-the-box’ AWS machine image (AMI), so changing the FS might be a good suggestion for an AMI update.

Ah … I shot a message to pmeenan about this.
Maybe he can chime in.

One thing on my todo list that would help significantly is to archive tests in place into zip files once a test is complete and all writes have finished. That would replace each result directory with a single result zip file, and all of the code would then extract the files it needs at runtime directly from the zip.

That would cut inode usage by at least an order of magnitude and also reduce the disk space wasted by block allocation (lots of tiny files, each consuming at least one full block).

It shouldn’t be too hard to pull off. Most of the file reads are already centralized, so it would just be a matter of hooking into them. Reading individual files out of a zip should also be plenty fast since you can seek directly to them.
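Roughly the same idea at the shell level (the result ID, path, and file names below are made up purely for illustration):

cd /var/www/webpagetest/www/results/16/01/15
zip -r -m ABCDE.zip ABCDE/               # -m removes the original files once they are in the zip
unzip -p ABCDE.zip ABCDE/somefile.txt    # read one file back out without unpacking the rest

The PHP side would do the equivalent through ZipArchive whenever it needs a file at runtime.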

In the short term, the best answer is to configure the server to archive to a different path on the same machine (/var/wpt-archive or something like that). That will zip up the tests and get the same inode savings while keeping the results on the same server.
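If I remember right, that is the archive_dir setting in settings/settings.ini (double-check the name against the settings.ini.sample that ships with your version):

; settings/settings.ini
archive_dir=/var/wpt-archive/

The archive script run from cron (cli/archive.php, if memory serves) then zips each completed test into that path, and the results code restores a test from the archive on demand when someone views it.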

A better solution would be to replace the individual files with a (file-based) database that can be merged on the server.
One option is SQLite (if you want an SQL DB) or UnQLite (if you want a document-store engine, which imho is better suited for WPT).

But I imagine that would take quite a bit of rewriting as well …

I don’t know if I’d say “better”. Having all of the results be stand-alone and completely stateless, with no database dependency, makes it a LOT easier to manage.