At the AWS Summit in San Francisco today, public cloud infrastructure provider Amazon Web Services (AWS) announced the launch of Redshift Spectrum, an extension of its Redshift managed data warehousing service that lets customers query data stored in the longstanding AWS S3 storage service.
The introduction of Redshift Spectrum should make certain types of queries more economical. Redshift bundles computing and storage capabilities, which makes it a more complex and costly service, especially for number crunching on large volumes of data.
Amazon chief technology officer Werner Vogels described a sample query that would run on an exabyte of data. With the Apache Hive open-source data querying software, it would take five years and 1,000 nodes, making it quite expensive. With Spectrum it would take 155 seconds and cost a few hundred dollars, Vogels said.
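To put those two figures side by side, here is a rough back-of-the-envelope calculation of the speedup they imply. It uses only the numbers Vogels quoted and assumes 365-day years; the variable names are illustrative, and the result is a ratio of claimed running times, not a benchmark.

```python
# Rough comparison of the two running times quoted on stage.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day years

hive_seconds = 5 * SECONDS_PER_YEAR    # "five years" quoted for Hive
spectrum_seconds = 155                 # "155 seconds" quoted for Spectrum

# Ratio of the claimed running times.
speedup = hive_seconds / spectrum_seconds

print(f"Hive: {hive_seconds:,} s; Spectrum: {spectrum_seconds} s")
print(f"Implied speedup: about {speedup:,.0f}x")
```

Taken at face value, the quoted figures work out to a speedup of roughly a million-fold.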
Already Docomo, Time, Edmunds, Redfin, and Yelp are using AWS Redshift Spectrum, he said.
Competitors include startup Snowflake’s cloud data warehouse. Microsoft Azure and IBM’s public cloud also offer data warehousing services.
AWS introduced Amazon Redshift in 2012. S3 itself dates to 2006.