Archiving results in private instance

Hi Pat,

I’ve just updated to 2.6. I had a little trouble with it (specifically with waterfall charts), but figured out that I needed to update my web server’s PHP to 5.3. Works like a charm now.

Anyway, I just wanted to pick your brain about any solutions or ideas you might have for archiving or compressing results data while keeping everything intact. I’ve collected over 50GB of data and would like to keep that historical data available to anyone who might need it.

Thanks in advance for your help.
Mike

If you can mount a remote filesystem to hold the archived data, I have a PHP script that will archive any test older than 3 days to another directory. Each test gets stored as a single zip file; it’s what I use on WPT to archive from the SSDs (where the current tests are kept) to a disk array for long-term storage. The code knows how to automatically restore archived tests, so it’s completely transparent to end users.

You could also go through and prune some of the less useful files to save space (like deleting all of the screenshots except the fully loaded one).
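For the pruning approach, something like this could work. The filename patterns here are assumptions, not confirmed WebPageTest naming, so check what files your version actually writes before deleting anything:

```shell
# Prune less-useful screenshot variants, keeping the fully-loaded one.
# The filename patterns below are assumptions -- verify against what
# your WebPageTest version actually writes before deleting anything.
RESULTS=/var/www/webpagetest/results   # assumed install path
find "$RESULTS" -type f \
  \( -name '*_screen_doc.jpg' -o -name '*_screen_render.jpg' \) -delete
```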

If you’re willing to handle restores manually when someone asks, you could also just tar/zip the directories under results/ and delete them (they are stored by year/month/day).
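A rough sketch of that manual approach (the paths here are assumptions; adjust them for your install):

```shell
# Manually archive one day's worth of results and delete the originals.
# Both paths are assumptions -- adjust them for your install.
RESULTS=/var/www/webpagetest/results   # assumed results directory
ARCHIVE=/mnt/archive                   # assumed long-term storage mount
DAY=11/05/01                           # year/month/day directory to archive

# Only delete the originals if the tarball was created successfully.
tar -czf "$ARCHIVE/$(echo "$DAY" | tr '/' '_').tar.gz" -C "$RESULTS" "$DAY" \
  && rm -rf "${RESULTS:?}/$DAY"
```

Restoring is then just extracting the tarball back under results/ before the test is accessed.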

Hi Pat,

I’m resurrecting this thread since I’m in need of it now. Would you be able to share the PHP script that archives results? If the script saves the data to a different folder, what changes do I need to make to the code so that when a request comes in for archived data, it gets returned to the requester?

Thanks as always for your work on this product.

Here is the archiving script I use: http://webpagetest.googlecode.com/svn/trunk/www/webpagetest/cli/archive.php

There are a couple of settings in settings/settings.ini that control it:

archive_dir=d:/archive/
archive_days=1

It will archive any tests older than archive_days to the directory specified in archive_dir, and then the regular code knows how to automatically restore tests when they are accessed.

I usually run it from an hourly cron job, but you can just run it manually if you don’t need to do it frequently.
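An hourly crontab entry could look like this (the install path is an assumption; use wherever your private instance lives):

```shell
# Run the archiver at the top of every hour (assumed install path).
0 * * * * php /var/www/webpagetest/cli/archive.php
```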

Hi Pat. Thanks for the response. It looks like it’s already included in the private instance package. I just started the script and it looks like it’s working.

Thanks again for your help.

Hi Pat,

I’m experimenting with archiving tests to Amazon S3. I’ve tried running /cli/archive.php but the script is throwing a fatal error when archive.inc calls the class ZipArchive. I can’t seem to find this class defined anywhere. Judging from the includes across the code it should be in common.inc? I must be missing something obvious.

Any pointers you can give on this missing class and the general method and stability of archiving to Amazon S3 would be appreciated.

Running release 2.9

Cheers,
Mark

ZipArchive is part of PHP’s zip module (how you install it depends on which OS you are running).
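For example, on a Debian/Ubuntu or RHEL/CentOS box, something like this should get it installed. The package names vary by distribution and PHP version, so treat them as assumptions and check your distro’s repos:

```shell
# Install PHP's zip extension -- package names are assumptions and
# vary by distribution and PHP version.
sudo apt-get install php5-zip   # Debian/Ubuntu
# sudo yum install php-zip      # RHEL/CentOS
php -m | grep -i zip            # verify the extension is loaded
```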

Archiving to S3 should work fine, but it could get expensive over time. I have some systems that archive to Google Storage for Developers, which is equivalent and uses the same APIs, and it works great.

Thanks Pat,

ZipArchive up and working now. What is the relationship between the “archive_dir” and “archive_s3_*” settings?

archive.php won’t run unless “archive_dir” is defined, but then in the ArchiveTest function, if “archive_dir” is defined, it tries to move the test to a local directory rather than pushing it to S3?

I’ve got it up and running by taking out the check for “archive_dir” in the ArchiveTest function, but I wonder if there needs to be another setting that switches between local and remote archiving rather than using “archive_dir” as the trigger.

Sorry, I just updated the code so that it also runs when ‘archive_s3_server’ is defined (the same check that the inner archiving code uses). If you grab the latest from SVN it should work better.
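For reference, an S3 block in settings/settings.ini might look something like this. Only ‘archive_s3_server’ is confirmed above; the other key names follow the same naming pattern but are assumptions, so check the settings the code actually reads:

```
archive_s3_server=s3.amazonaws.com
archive_s3_key=YOUR_ACCESS_KEY
archive_s3_secret=YOUR_SECRET_KEY
archive_s3_bucket=wpt-archive
```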