Announcing release of HadoopDB

pedalpete · on July 21, 2009

Cool that these guys have built a tool/stack to implement a complete Hadoop/PostgreSQL layer (if I understood the article correctly).

But it brings up the question... Why are data and data processing outstripping hardware capabilities at such an alarming rate? Is this whole push for non-relational database performance the right direction, or should we be focusing on new hardware solutions?

icey · on July 21, 2009

Can someone with some experience with Hadoop tell us if this is a big deal or not?

I'm inclined to think that it is, but I only have the press release to judge by.

vicaya · on July 21, 2009

It's a poor man's Vertica. Mostly good for analytics workloads.

It's quite strange that they didn't reference the Bigtable paper at all, while saying "to the best of our knowledge, there exists no published deployment of a parallel database with nodes numbering into the thousands". Google had a dozen Bigtable clusters with more than 500 nodes and at least one cluster with a few thousand nodes (for the main crawl db), more than 3 years ago.