FrontPage - Hadoop Wiki
Our application has a background scheduler which works fairly well for many tasks related to our application. But, I wonder if it would be useful to look into using a tool like hadoop or another implementation of map reduce for finer grained tasklets. Move as much of actual application work out of the HTML Request->Response cycle and into the background.
What gets me trulyl excited about tools like Hadoop and languages like erlang the ability to take advantage of new hardware easily when we upgrade as well as providing an easy path for expansion once it's decided that it's time to expand.
I've not had a lot of time to research hadoop, but my understanding is that it provides an open source framework to using the map-reduce concept popularized by google. Hadoop also provides a file system that is shared across all the nodes.
Hadoop is written in Java, but they do have an example of a mapreduce job written in python, compiled via jython and then executed on a cluster.
I also ran across an article which indicates that you can write hadoop jobs directly in python w/o the conversion step. Writing An Hadoop MapReduce Program In Python
Related link: Greenplum and Aster Data have both just announced the integration of MapReduce into their SQL MPP data warehouse products
Interestingly, I believe greenplum database is based on postgreSQL.
Our application has a background scheduler which works fairly well for many tasks related to our application. But, I wonder if it would be useful to look into using a tool like hadoop or another implementation of map reduce for finer grained tasklets. Move as much of actual application work out of the HTML Request->Response cycle and into the background.
What gets me trulyl excited about tools like Hadoop and languages like erlang the ability to take advantage of new hardware easily when we upgrade as well as providing an easy path for expansion once it's decided that it's time to expand.
I've not had a lot of time to research hadoop, but my understanding is that it provides an open source framework to using the map-reduce concept popularized by google. Hadoop also provides a file system that is shared across all the nodes.
Hadoop is written in Java, but they do have an example of a mapreduce job written in python, compiled via jython and then executed on a cluster.
I also ran across an article which indicates that you can write hadoop jobs directly in python w/o the conversion step. Writing An Hadoop MapReduce Program In Python
Related link: Greenplum and Aster Data have both just announced the integration of MapReduce into their SQL MPP data warehouse products
Interestingly, I believe greenplum database is based on postgreSQL.

