Amazon to Host Scientific Data Sets

Amazon announced that it will offer free hosting for several popular scientific data sets. I think this is a genius move for Amazon, as it eliminates the cost of uploading your big data sets to the Amazon system. With the data sets already staged on their S3 system, all you need to do is fire up any number of compute nodes in EC2 and run the necessary computations on the data. Amazon will let you create your own private snapshot of the public data, which you can store in a personal Amazon EBS volume and compute on. The following scientific data sets will be available initially:

Biology

  • Annotated Human Genome Data provided by ENSEMBL

Chemistry

  • A 3D Version of the PubChem Library provided by Rajarshi Guha at Indiana University
  • UGI Virtual Conformer Library provided by Rajarshi Guha at Indiana University: 80 GB of data in SD format, covering conformers for 500,000 molecules, that can be used for virtual screening

It seems that they are open to suggestions for additional data sets, so please send them your ideas.
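As a rough illustration of the workflow described above (create a private EBS volume from a public data set, then attach it to a running EC2 instance), here is a minimal Python sketch. It assumes a client object that behaves like a boto3 EC2 client; the function name, the default device name, and any snapshot or instance IDs you pass in are hypothetical placeholders, not values published by Amazon.

```python
def attach_public_dataset(ec2, snapshot_id, instance_id, availability_zone,
                          device="/dev/sdf"):
    """Create a private EBS volume from a public snapshot and attach it.

    `ec2` is assumed to behave like a boto3 EC2 client, for example
    boto3.client("ec2"). It is passed in so this sketch has no hard
    dependency on any particular library version.
    """
    # The volume must be created in the same availability zone as the instance.
    volume = ec2.create_volume(SnapshotId=snapshot_id,
                               AvailabilityZone=availability_zone)
    volume_id = volume["VolumeId"]

    # Wait until the new volume is available before trying to attach it.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    # Attach the volume to the instance under the given device name.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return volume_id
```

With boto3 installed and credentials configured, this could be called with a real client and the snapshot ID of the data set you want. The volume would still need to be mounted inside the instance before the data can be read.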

Comments

Costs

Can anybody comment on the costs of using this facility?

The coolest thing is that

The coolest thing is that Amazon provides you free access to the datasets. You attach the dataset to your EC2 instance as a block storage volume, and data transfer between the two is free. You only have to pay for the use of your EC2 instances. In other words, access is free; you only pay for the compute time.

More info on EC2, including pricing, can be found here: http://aws.amazon.com/ec2/

To get a better idea, you can use their calculator to estimate the cost of different scenarios:
http://calculator.s3.amazonaws.com/calc5.html
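
For a quick back-of-the-envelope figure before using the calculator, the pricing model described in this comment (free access to the data, pay only for instance time) can be sketched in Python. The function name is made up for illustration, and the hourly rate is something you have to look up on the pricing page; none of the numbers here are actual Amazon prices.

```python
def estimate_compute_cost(instances, hours, hourly_rate_usd):
    """Estimate the cost of a run that reads a hosted public data set.

    Assumes, as described in the comment above, that access to the data
    is free and only EC2 instance time is billed. Storage for your own
    results and other charges are ignored in this sketch.
    """
    if instances < 0 or hours < 0 or hourly_rate_usd < 0:
        raise ValueError("instances, hours and hourly_rate_usd must be non-negative")
    # Every instance is billed for every hour it runs.
    return instances * hours * hourly_rate_usd


# Hypothetical scenario: 10 instances for 5 hours at an assumed rate of $0.10/hour.
cost = estimate_compute_cost(instances=10, hours=5, hourly_rate_usd=0.10)
print(f"Estimated compute cost: ${cost:.2f}")
```

The cost scales linearly with both the number of instances and the running time, so using twice as many nodes for half as long costs the same under this simple model.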

Mathematica's Curated data

In this context, it's also worth noting Mathematica's direct integration of an expanding set of curated data sets:

http://reference.wolfram.com/mathematica/guide/NewIn70ComputableData.html

Also see the link to "Computable Data" on:

http://reference.wolfram.com/mathematica/guide/Mathematica.html

David Reiss
http://Scientificarts.com/worklife