Disco
Very light weight tool for writing map-reduce code. Erlang handles all the scheduling, etc and you write your map and reduce functions in in python. Lighthttp is used for transferring data between the nodes as needed.
Disco would be similar to hadoop but is lighter weight and uses erlang and python instead of java.
One thing I'm curious about is how high availabilty might be achieved. Not that I'm using any really large clusters of machines. One of the things that makes google's setup so interesting is that they can easily handle a node dieing without skipping a beat. They can do that because their storage is built to be distributed and highly available as well as the job scheduling. Both layers easily handle a node dropping out.
Very light weight tool for writing map-reduce code. Erlang handles all the scheduling, etc and you write your map and reduce functions in in python. Lighthttp is used for transferring data between the nodes as needed.
Disco would be similar to hadoop but is lighter weight and uses erlang and python instead of java.
One thing I'm curious about is how high availabilty might be achieved. Not that I'm using any really large clusters of machines. One of the things that makes google's setup so interesting is that they can easily handle a node dieing without skipping a beat. They can do that because their storage is built to be distributed and highly available as well as the job scheduling. Both layers easily handle a node dropping out.


0 comments:
Post a Comment