Saturday, May 31, 2008

Google's Android demos

AppleInsider | Google's Android demo shows app store, tweaks iPhone formulas

Some cool demos of Google's upcoming phone software, Android.

Kamaelia - Concurrency in python

Kamaelia Introduction

Some interesting things:

1. Similar in idea to unix pipes

2. Python based

3. Seems to be entertainment (radio streaming, chat, games) oriented.

4. Related to (part of?) the BBC.

5. Seems pretty simple to get up and running.

Might need to come back and play with this sometime.
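To make the unix pipes comparison concrete, here is a minimal sketch using plain Python generators. This is not Kamaelia's actual API; the stage names (`grep`, `upper`) and the `pipeline` helper are made up for illustration, and the only assumption is the general idea of small components wired output-to-input.

```python
from functools import partial


def grep(pattern, stream):
    """Pass along only the items that contain pattern (like `grep` in a shell pipe)."""
    for line in stream:
        if pattern in line:
            yield line


def upper(stream):
    """Transform each item as it flows through."""
    for line in stream:
        yield line.upper()


def pipeline(data, *stages):
    """Chain stages so each one consumes the previous stage's output."""
    stream = iter(data)
    for stage in stages:
        stream = stage(stream)
    return stream


messages = ["radio stream", "chat message", "radio chat"]
result = list(pipeline(messages, partial(grep, "radio"), upper))
```

Each stage only sees what the previous stage hands it, which is the same property that makes shell pipelines easy to rearrange.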

Notes on a history of Erlang

Notes on A History of Erlang at Ted Leung on the Air

Excellent article talking about the history of Erlang. Some highlights I found interesting:

Lisp and Smalltalk are cited as inspirations, but more for the implementation of the runtime than for any features in the language.



Erlang definitely doesn't seem to have the quality that Lisp has of being able to mold the language itself to the problem domain. Erlang the language, like most other languages in use today, is static or dead. Lisp has this interesting (and compelling) feature of being able to live and change as needs change. Yet I have not found as compelling a solution to concurrency and stability as Erlang seems to have provided. If a shared-nothing architecture is truly the only solution to massive concurrency in a general purpose language, then most languages in use today will be unable to scale easily as concurrency increases. But the reality is that regardless of how bad a language is at a task, the inertia of millions of lines of code will extend the longevity of today's languages for quite a number of years. It just means that unless some genius comes up with a fantastic way to do object oriented programming in today's languages and still take advantage of concurrency (history seems to prove this isn't possible), we programmers will be struggling with locks and threads for a long time. It may be that something as simple as Google's MapReduce may be the answer after all.

Reliability

Erlang really champions the idea of having reliability primitives built into the language itself. In most systems, reliability is achieved by running many processes on multiple machines, so that if one process dies, requests can still be handled. This is how it is recommended that one scale Squeak/Seaside, Ruby on Rails, Django, even PHP. Is there a need for the fine grained reliability support that Erlang provides? And can't a good developer provide 90% of this functionality in today's environments? Perhaps, but it can be hard to recover from crashes of threads in a nice way. Also, in most languages, failure has to be handled after the application is written. You only deal with failure once you've scaled past the single server scenario.
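As a rough picture of what "let it crash and restart" looks like when you have to build it yourself, here is a small single-threaded Python sketch. It is not how Erlang's supervisors are implemented; `supervise` and `make_flaky_worker` are hypothetical names, and the restart limit is an arbitrary assumption.

```python
def supervise(task, max_restarts=3):
    """Run task; if it raises, restart it, giving up after max_restarts restarts.

    Returns (result, number_of_restarts_used).
    """
    restarts = 0
    while True:
        try:
            return task(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise  # escalate: the supervisor itself gives up
            restarts += 1


def make_flaky_worker(failures):
    """Hypothetical worker that crashes `failures` times, then succeeds."""
    state = {"remaining": failures}

    def worker():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("worker crashed")
        return "done"

    return worker


outcome, restart_count = supervise(make_flaky_worker(2))
```

Even this toy version shows the catch: the restart policy lives in application code, so every team ends up writing (and debugging) its own.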

Conclusion

For me, Erlang's two biggest strengths are that it seems to have solved the issues of concurrency and reliability in massively concurrent scenarios. Yet, as a language, Erlang isn't as compelling to me as languages like Lisp and Smalltalk. Ironic, since both languages are quite old by today's standards. Yet they both are still teaching new tricks to the popular languages of today.

Google Spotlights data center inner workings (Cnet)

Google spotlights data center inner workings | Tech news blog - CNET News.com

Some interesting discussions about how Google does their thing.

They talk about the three core elements of Google's infrastructure: GFS, BigTable and MapReduce.

GFS provides a fault tolerant file storage system that runs across many different machines. The file system handles failure of a node and copies any data that was on that node to other nodes.

BigTable provides structured data services and runs on top of GFS. BigTable is not a traditional relational database. To operate at the level of parallelism that Google does, you have to give up some of the strict semantics that traditional relational databases operate under.

Finally, they use MapReduce, which is a methodology for writing software. Here is Google's description.

The irony to me in all of this is that Google did all of this incredibly parallel coding with traditional programming languages. This is most likely because their problem can map onto the MapReduce idea fairly simply. Not to belittle what they have accomplished. The scale of what Google has done is so far beyond what anyone else has done that it's mind boggling. But the core of what they do is surprisingly simple. I'm sure making it all work has taken many thousands of man hours of development time, yet the idea is so simple that anyone can take it and apply it to their problem domain.
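To show just how simple the core idea is, here is a toy word count in the map/shuffle/reduce shape, in plain Python on one machine. This is only a sketch of the programming model, not Google's implementation; the function names are my own, and everything that makes the real thing hard (distribution, failure handling, scheduling) is left out.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = map_reduce(["the quick fox", "the lazy dog", "the fox"])
```

Because each `map_phase` call looks at one document and each `reduce_phase` call looks at one key, both phases could in principle be farmed out to many machines without the calls needing to know about each other.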

In my day job we have a web application that for the first years of its life ran primarily as a traditional web application. Request comes in, do some work, send response back. At some point we realized that there were long running tasks that couldn't be done easily this way, so a background scheduler tool was written to handle those jobs. As our application has evolved, I find I'm writing more and more code that runs through this system. We have some things that limit our ability to scale as easily as Google has, one being that we are heavily dependent on a relational database and many of the jobs that run in the background hit the database pretty heavily. But I'd love to explore what it would take to make that part of the system more parallel and more robust. We actually handle failure fairly well, but we still have some dependencies between jobs that cause some "weird" things to happen periodically if jobs get run out of order. Either way, any web application of any level of sophistication will probably find itself wishing for a robust background tool something like MapReduce.
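One way to attack the "jobs run out of order" problem is to declare the dependencies explicitly and let a topological sort pick a safe order. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the job names are invented for illustration and have nothing to do with our actual scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs: each job maps to the set of jobs that must finish before it.
dependencies = {
    "send_report": {"build_report"},
    "build_report": {"import_orders", "import_customers"},
    "import_orders": set(),
    "import_customers": set(),
}


def run_order(deps):
    """Return one ordering in which every job comes after its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
```

A circular dependency raises `graphlib.CycleError` instead of silently producing a bad order, which is usually what you want from a scheduler.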

I also can't help wishing for access to GFS. Anytime you have more than one server and you have to deal with files in any way, you quickly start needing to handle shared files. We use Linux servers, and so far in the few instances where I have multiple servers that need access to the same files, we've used NFS. But it seems like there is a real need for a fault tolerant file system that runs in the "cloud", to overuse an already overused term. Each server contributes file storage and can "serve" any chunks of a given file it contains. When another server needs a particular file, my understanding is that it queries the master server, which tells it which chunk servers the pieces are on. From that point on, the server needing the data communicates with the chunk servers directly. Data is stored on multiple chunk servers so the loss of any one server doesn't cripple the file system. Again, it seems such a simple idea in retrospect, yet there aren't many (any?) available file systems that provide these pieces of functionality. I've never been able to confirm it in my research, but I expect that they give up full POSIX file semantics and use an in-house library for file access. It seems that there was an Apache project trying to reproduce GFS. Might have to research that a little bit more.
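Here is my understanding of that read path as an in-memory Python sketch. Everything in it is an assumption for illustration: the class names, the tiny chunk size, the round-robin placement, and the shortcut of letting the master place data on writes are mine, not details of the real GFS.

```python
class ChunkServer:
    def __init__(self, name):
        self.name = name
        self.chunks = {}  # (filename, chunk index) -> bytes
        self.alive = True

    def read(self, filename, index):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.chunks[(filename, index)]


class Master:
    """Tracks which chunk servers hold each chunk of each file."""

    def __init__(self, servers, replicas=2, chunk_size=4):
        self.servers = servers
        self.replicas = replicas
        self.chunk_size = chunk_size
        self.locations = {}  # (filename, chunk index) -> list of ChunkServer

    def write(self, filename, data):
        # Simplification: the master places the data itself, round-robin.
        for index, start in enumerate(range(0, len(data), self.chunk_size)):
            chunk = data[start:start + self.chunk_size]
            holders = [self.servers[(index + r) % len(self.servers)]
                       for r in range(self.replicas)]
            for server in holders:
                server.chunks[(filename, index)] = chunk
            self.locations[(filename, index)] = holders

    def locate(self, filename):
        """Return [(chunk index, [servers holding it])] in file order."""
        indexes = sorted(i for (name, i) in self.locations if name == filename)
        return [(i, self.locations[(filename, i)]) for i in indexes]


def read_file(master, filename):
    """Ask the master where the chunks are, then read from chunk servers directly."""
    parts = []
    for index, holders in master.locate(filename):
        for server in holders:
            try:
                parts.append(server.read(filename, index))
                break  # first live replica wins
            except ConnectionError:
                continue
        else:
            raise OSError(f"no live replica for chunk {index} of {filename}")
    return b"".join(parts)


servers = [ChunkServer(f"cs{i}") for i in range(3)]
master = Master(servers)
master.write("movie.dat", b"hello world!")
servers[0].alive = False  # lose one machine
recovered = read_file(master, "movie.dat")
```

With two replicas per chunk, losing any single server still leaves a live copy of every chunk; losing two servers that share a chunk does not.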

BigTable is interesting, though after having watched a Google tech talk about BigTable I think it would be a challenge to use it in the applications I work on. It seems like much of what I do requires an update to a piece of data and then a response when it is done, with a guarantee that no one else can change that data at the same time. All of this has to happen quickly, and so far it seems as if a relational db with ACID is the only real answer for most business software. Perhaps there are ways around this requirement. At least it's something to think about. The Erlang database, Mnesia, seems to have some of the same characteristics as BigTable does.
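For reference, this is the kind of guarantee I mean, sketched with Python's built-in `sqlite3` module. The `accounts` table and the `transfer` function are made-up examples, and SQLite here is just a stand-in for whatever relational database you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            (balance,) = conn.execute("SELECT balance FROM accounts WHERE id = ?",
                                      (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


first_ok = transfer(conn, 1, 2, 30)
after_first = balances(conn)
second_ok = transfer(conn, 1, 2, 500)  # would overdraw, so it is rolled back
after_second = balances(conn)
```

The failed transfer leaves no trace: the debit that had already been applied inside the transaction is undone by the rollback.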

Tuesday, May 27, 2008

The Road we didn't go down

armstrong on software: The Road we didn't go down

"But if you don't have the time or energy, the fundamental problem is
that RPC tries to make a distributed invocation look like a local one.
This can't work because the failure modes in distributed systems are
quite different from those in local systems, ..."
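A tiny simulation of the point, as I read it. Nothing here is from the article: `remote_add`, `RemoteTimeout` and the `lose_request` / `lose_reply` switches are hypothetical stand-ins for a flaky network.

```python
class RemoteTimeout(Exception):
    """The caller gave up waiting; the remote side may or may not have done the work."""


def local_add(counter, amount):
    """A local call: it either happens and returns, or it raises. No in-between."""
    counter["value"] += amount
    return counter["value"]


def remote_add(counter, amount, lose_request=False, lose_reply=False):
    """The same call over a simulated network that can drop messages either way."""
    if lose_request:
        raise RemoteTimeout("request never arrived")
    counter["value"] += amount  # the work happens on the remote side...
    if lose_reply:
        raise RemoteTimeout("reply never arrived")  # ...but the caller can't tell
    return counter["value"]


counter = {"value": 0}
try:
    remote_add(counter, 1, lose_reply=True)
except RemoteTimeout:
    remote_add(counter, 1)  # naive retry, as if it were a failed local call
value_after_retry = counter["value"]
```

The caller asked for one increment and got two, because a timeout cannot distinguish "never ran" from "ran, but the answer got lost". A local call has no equivalent failure mode.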


Samsung 256 GB SSD

Samsung Develops World’s Fastest and Largest Capacity 2.5-inch, MLC-based (256GB) SSD with SATA II Interface SAMSUNG

No pricing as of yet, but people are guessing in the $5,000-6,000 price range. It will probably be a while before it is available for normal laptops/desktops. I'm amazed at how quickly they are progressing. They are a couple of generations away from the largest desktop hard drives in the 750GB-1TB range, provided they can keep doubling capacity with each generation. So, your next laptop probably won't have an SSD, but in three years, who knows...
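The "couple of generations" estimate is just repeated doubling, which is easy to check. This sketch assumes capacity exactly doubles each generation and treats 1TB as 1000GB.

```python
def generations_needed(current_gb, target_gb):
    """Count how many doublings it takes for current_gb to reach target_gb."""
    generations = 0
    while current_gb < target_gb:
        current_gb *= 2
        generations += 1
    return generations


to_750gb = generations_needed(256, 750)   # 256 -> 512 -> 1024
to_1tb = generations_needed(256, 1000)    # 256 -> 512 -> 1024
```

Both targets come out to two doublings, since 512GB falls short of either and 1024GB clears both.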

Saturday, May 24, 2008

Parsing's not just for languages anymore

Parsing's not just for languages anymore | LispCast

Another one of my current interests: domain specific languages. OMeta looks interesting.

OMeta in lisp

Scala versus Erlang

The Scala vs Erlang whirlwind at Ted Leung on the Air

Ted Leung does a wrap up of a recent "whirlwind" of debate surrounding Scala and Erlang. I've been watching Erlang with a great deal of interest over the last months. I'm not very familiar with Scala, so it is interesting to hear how it compares. I must say that my early years of struggling with Java performance have forever tainted my view of Java and the JVM, even though many people make assurances that it is light years ahead of what it used to be. Regardless, this post is not about Java performance.

To my simple minded approach, the argument boils down to this. Is it possible to bolt on the best of Erlang (shared nothing, high availability, concurrent scaling, etc.) to an existing infrastructure? One of Scala's (and Clojure's) strong points is the ability to leverage the rich ecosystem of Java libraries. Yet one does wonder: if those libraries are not "share nothing", will you not run into situations where two Scala actors try to do something in Java libraries and step on each other's toes?

I've been having a similar discussion about Lisp and Erlang. For the most part, Lisp seems the far more powerful language when compared with Erlang. All things being equal, one would prefer to program in the more flexible language, i.e. Lisp. Among many other things, Lisp macros let you essentially add new features into the language itself. Scheme has better multi-processing primitives in it, but even Common Lisp can approximate much of what is done in Erlang using continuations. But here again is the catch. At some point, the programmer will need to use existing libraries to do additional work, and at that moment they have stepped outside of the "shared nothing" paradigm, since those libraries are not written using the same shared nothing primitives.

So, to me the essential question is this. Is it possible for a language to have the benefits of Erlang without having its primitives built into the lowest levels of the language and its supporting libraries? Without that, anytime you step outside of the "safe" environment, you risk problems with mutable objects, global/class variables, etc. The onus rests on the programmer to make sure they don't do something that would either break concurrency (locks, semaphores) or completely crash the application because two pieces of code try to mutate the same item concurrently.
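Here is what the bolted-on version looks like in practice: a minimal actor built from Python threads and queues. It is only a sketch of the message passing style, not of Erlang processes; `CounterActor` and its message names are invented for the example.

```python
import queue
import threading


class CounterActor:
    """Owns its state; the only supported way to touch it is to send a message."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only ever modified by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        while True:
            message, reply_to = self._mailbox.get()
            if message == "increment":
                self._count += 1
            elif message == "get":
                reply_to.put(self._count)
            elif message == "stop":
                break

    def send(self, message, reply_to=None):
        """Fire-and-forget: drop a message in the mailbox."""
        self._mailbox.put((message, reply_to))

    def ask(self, message):
        """Send a message and block until the actor replies."""
        reply = queue.Queue()
        self.send(message, reply)
        return reply.get()

    def stop(self):
        self.send("stop")
        self._thread.join()


def hammer(actor, times):
    for _ in range(times):
        actor.send("increment")


actor = CounterActor()
senders = [threading.Thread(target=hammer, args=(actor, 1000)) for _ in range(4)]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()
total = actor.ask("get")
actor.stop()
```

No locks are needed because only one thread ever writes `_count`. But nothing in Python stops some library from reaching in and modifying `actor._count` directly, and that is exactly the problem: the discipline is a convention, not something the language enforces.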

To my mind, concurrency is not a feature that can be bolted on to an existing language/set of libraries and do much good. Yet, I actually expect a case of worse is better to win the day. There will be a huge push from the installed base of Java (and .NET) environments to try and get many of the benefits of languages like Erlang while maintaining the ability to do more traditional programming. Time will tell whether this will be possible or not.

Tuesday, May 20, 2008

Come on Netflix...


Open letter to Netflix,

What people want is so blindingly simple that there has to be a reason they don't provide it. Here is what people want. OK, actually, here is what I want. :-)

1. Use the exact same model as what you've got now. Instead of making me wait for the DVD/Blu-ray disc in the mail, send a disc rip over the internet to the set top box I purchased for a one time fee.

2. Always be trickle streaming content, perhaps with the ability to say, hey between 10pm and 5am, knock yourself out and download at full speed. But the rest of the time, let me specify, hey use x amount of my connection to be downloading additional movies in my queue till my hd is full.

3. Download the next X items in the queue where X is limited by the internal hard drive. Even small laptop hard drives could handle quite a number of high definition movies.

4. Provide HD, widescreen and normal outputs as well as 5.1 surround sound.

5. Use the same viewing model. I can watch the "current" movie as many times as I want for as long as I want. When I'm done, it's deleted from the local hard drive and the next movie in the queue (previously downloaded) is immediately available.

6. Add the same limits we have today, i.e. if you have 2 a month, after you've watched 2 you have to wait till the month is over (or upgrade the account).

7. Finally, lock up the content in whatever way makes the content providers the happiest. Use hardware encryption, whatever it takes. The local box simply caches your chosen movies till you play them.

The benefit the consumer gets is a set top box that replaces or supplements their DVD and Blu-Ray players while getting the exact same experience. And, they don't have to deal with the problems of streaming movies where even on a high speed connection you get buffering and drop outs. Apple does this very well. They let you start watching immediately, but if you wait, the whole thing gets downloaded locally allowing disconnected watching.

I can think of three reasons why they are not doing this.

1. The content providers are worried that if any content is stored locally, it can be hacked. Yet, Apple has been able to make that deal. So, perhaps this isn't it.

2. People don't actually want the same experience they have right now with mailed DVDs. Perhaps let the user choose any of the X items locally stored as their next movie. Not quite pay per view, but very cool. Especially since you picked the movies in your queue. (Updated: Apparently the Roku does this. Any movie in your queue shows up in the list.)

3. Bandwidth costs far exceed what they would make back without raising prices.

I actually think #3 is the real problem. They cannot stream that much video per person on a regular basis and keep up with the current cost structure. Bandwidth and server costs would instantaneously swamp any money they saved from postage costs.

So, if this is the case, then I would humbly suggest that they look at the bittorrent protocol. No, it won't work in every installation, but it should work in enough that it would significantly reduce load on their servers and make it more affordable and faster for everyone.

Still, at $99 the Roku might be worth the cost. If nothing else, they are at least giving a little bit of competition to Apple.

NetFlix: First Netflix Streaming Box Review, $100 and Unlimited Downloads!

Heh, apparently I'm now part of the Lazyweb. :-)


Coding Horror: Lazyweb Calling

Renting and downloading movies in the living room


New Netflix/Roku Set-top Streams Movies for Free - Technology News by ExtremeTech

Another option hooked up to the Netflix service. I really like the Netflix service for DVDs and we rarely ever go to a video store any longer. It's so much easier for us to put items in the queue when we hear about them rather than stand around the video store for 45 minutes trying to find a movie to watch. My wife and I alternate picking the next movie in the queue, so everybody's happy. Oh, did I mention no late fees? After a particularly bad experience with a local video store, we haven't been back since. Being able to watch a movie as many times as we want, stop it and restart it later, etc. is huge. It used to be a mad rush to get movies back Sunday night or early Monday morning.

Now, the next logical step of course is a set top box that goes in your living room that just downloads the movies and lets you watch them. The Apple TV has been doing this for a while and is a tempting option. The only thing is, they don't have the idea of a queue of movies you'd like to watch someday, and their whole "movie expires 24 hours after you start watching it" rule is really annoying. But it is quite handy to be able to move a movie onto an iPhone or iPod, or watch it from your computer.

I've played with the online video watching portion of Netflix, but there are several critical shortcomings. One, you had to have a Windows machine to use the player (of which we currently have none). Two, the catalog and video quality were very, very limited even with a very fast broadband connection. It is rare to find anything in our queue that you can watch this way. A set top box like the Roku solves the first problem nicely. But until Netflix can expand its inventory of movies and bring their quality up to what you get with Apple's rental service, we will probably be doing the Netflix DVD rentals for quite a while.

Facebook chat is developed in Erlang


Planet Erlang - Facebook chat is developed in Erlang

Erlang keeps popping up. It seems to excel in high load situations.

Now somebody just needs to create a mashup of erlang, lisp, seaside and Ruby on Rails. That would be pretty close to my ultimate development environment.

  • erlang - for the automatic server parallelization and ease of expanding across multiple backends.
  • lisp - for the domain specific languages and power of macros, et al
  • seaside - ease of debugging, continuation based web server
  • Ruby on Rails - Hmmm, MVC, integrated testing, RAD development
  • Integrated Flash/Flex or openlaszlo for gui interface?

Oh, and I'm probably forgetting a few others as well. Anyone ready to step up to the plate?

Saturday, May 17, 2008

Dash Express - Two way GPS


Dash Express review - Engadget

Though I don't have a need for a GPS day to day, I always wish I had access to one when we travel a couple of times a year. The Garmin Nuvis got really good reviews from Consumer Reports in a recent issue. But I've been hearing good things about the Dash Express as well. Today I came across this Engadget review of the Dash.

The Dash claims to be the first internet enabled GPS. One of the truly interesting things this allows is real-time traffic tracking, by monitoring the speeds of all the Dash devices directly. So, the traffic data is much better than that of traditional GPS devices, which rely only on road sensors and commercial vehicles. The network effects have great possibilities.

Another thing that is really interesting about this device is that it gives you choices of routes based on distance, time, etc. Also, because it's two way connected, it's easy to do a current search of gas prices, nearby restaurants, restrooms, etc. All things that are really useful when you're on a trip. This is a situation where adding the internet to a device really increases its usefulness. It's not just a gimmick.

Now the negatives. Cost, the device isn't cheap ($399). In addition, you have to pay a monthly subscription fee to continue to receive updates to the device and take advantage of the internet connectivity. Also, it's not a small device, so it will take up some space.

All in all, I'd love to play with one of these. Sadly it's really hard to justify the expense for the occasional need.

Dash website below. The demo video is interesting.

http://www.dash.net/

Do it yourself multi-touch hardware


Multitouch Goodness: Full-Screen Multitouch Mac OS X Is Here (But Not from Apple)

Highlights

  • The actual interface can be made with a piece of glass, a cardboard box and a webcam.
  • They prototyped applications using Flash as the front end.
  • The multi-touch code is in C, I believe.

These things are always really cool to look at. I'm sure we will see many uses for them. But so far I don't think anyone has found a really useful way to work with a desktop computer with a multitouch interface. It works really well on the iPhone with a few small caveats, and I can easily see it working with a tablet style device. But it seems as if one would tire working with a desktop computer this way.

Anyway, the demo is pretty cool. Looks like a fun little project to play with.

Monday, May 5, 2008

Thursday, May 1, 2008

Where's the app?


Architecture astronauts take over - Joel on Software

Don't really have time to comment extensively on this, but thought Joel had some very pertinent points. The main one being that the average person doesn't care about the platform. They care about the killer application: what can it do for me? Once you have a compelling application, then people start wanting access to your technology, your platform per se.

We are seeing this with Google right now. They built a compelling application, search, and only recently are they starting to talk about their architecture and opening it up for use by other programmers. Microsoft on the other hand has seen themselves as a platform vendor. They build platforms for others to build applications on. Yet, one wonders where Windows would be without Office. I would contend that without Office, Windows would not have been nearly as compelling an experience. Office is their killer app.

With Windows Mesh, Microsoft is again having difficulty even expressing what Mesh does, let alone why it's interesting. Hence you have many people who are just really confused as to what it is and what it does. As Joel suggests, it brings back echoes of Hailstorm and .NET, both of which were/are kind of fuzzy as to what they are or what they do. So, what is the killer Mesh application?

Clojure


Introduction | Clojure

Lisp-like language on the JVM. Listening to the screencast now:

Screencast


I'm curious how concurrency works in this environment.