Claudio Cherubino's blog Life of a Googler


Free tickets and promo code for Progressive .NET Tutorials in London

It's been less than a month since I left London to move to the Bay Area and I already have a good reason to come back.

Skills Matter and the London .NET User Group are organizing the Progressive .NET Tutorials in London on September 5-7 and the agenda looks quite interesting, with two tracks and a lot of important names, including my fellow Googler Jon Skeet!

Unfortunately, I won't be able to attend the event but the good part of it (good for you) is that I have two free tickets to give away!

The first two people to comment on this post will get a free ticket each, but please only participate if you really plan to attend the event.

Don't despair if you don't get one of the two free tickets: if you want to buy one, you can use the promo code PROGNET50 when registering to get a £50 discount off the ticket price.


Dissecting the Google Developer Advocate Team page

One of the side projects I worked on at the end of last year is the Google Developer Advocate Team page, a web application that provides bios for all members of my team and lets you track the public events we are going to attend.

We'll probably end up open-sourcing the code but I've already got questions about the technologies adopted so I decided to write this post to explain some of the design choices.

The application is written in Java and runs on App Engine, which provides scalability and simple deployment and administration.

One of the main requirements when designing the application was that it had to seamlessly integrate into our existing workflow in order to be as easy as possible for Google advocates to insert their events. Internally we use Google Calendar to track our trips and speaking opportunities so it was straightforward to use the Calendar Data API to fetch data from a shared calendar.

A cron job periodically checks that calendar, parses new events to extract the relevant info (date, time, location, speakers, products, description) and updates the Datastore using JDO.
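The parsing step could look roughly like the sketch below. It is not the application's actual code: the class name EventDescriptionParser and the "Speakers:" / "Products:" line convention inside the event description are assumptions made purely for illustration, and the Calendar Data API and JDO calls are left out entirely.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: splits a calendar event description into labelled fields.
// The "Speakers:" and "Products:" prefixes are an assumed convention, not one
// documented in the post.
final class EventDescriptionParser {
    private EventDescriptionParser() {}

    static Map<String, String> parse(String description) {
        Map<String, String> fields = new LinkedHashMap<>();
        StringBuilder freeText = new StringBuilder();
        for (String line : description.split("\\R")) {
            int colon = line.indexOf(':');
            String key = colon > 0
                    ? line.substring(0, colon).trim().toLowerCase(Locale.ROOT)
                    : "";
            if (key.equals("speakers") || key.equals("products")) {
                // Labelled line: keep everything after the first colon.
                fields.put(key, line.substring(colon + 1).trim());
            } else if (!line.trim().isEmpty()) {
                // Anything else is treated as part of the free-text description.
                if (freeText.length() > 0) {
                    freeText.append(' ');
                }
                freeText.append(line.trim());
            }
        }
        fields.put("description", freeText.toString());
        return fields;
    }
}
```

In a real cron job the resulting map would be copied into a persistent entity together with the date, time and location that the calendar entry already provides as structured fields.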

Advocates' profiles are stored in a Google Spreadsheet which we can internally update using a simple web form. The public page uses the Spreadsheets Data API to get the relevant pieces of information and display them.

The Google Developer Events Calendar page embeds a Google Map using JavaScript and uses the Geocoding API to map event locations to geographical coordinates that can be pinned to the map or searched for.

Want to see your face on the Google Developer Advocates page? We are hiring!


Using JetS3t to interact with Google Storage

Google announced a lot of cool stuff at Google I/O, including Google TV, Android Froyo and App Engine for Business, but the project that really caught my attention was Google Storage for Developers.

Cloud storage is a huge business and many successful online services such as Dropbox, SlideShare or Twitter already use Amazon S3 for their storage needs. The idea is simple: you store your files on Google's infrastructure and access them from anywhere, paying only for what you use.

Google Storage is a RESTful service, so it is possible to interact with it through the standard HTTP methods (GET, POST, PUT and DELETE) as explained in the Developer's Guide, but it is much easier to rely on the client libraries, which hide the complexities of the raw protocol.
A Python library can be downloaded from the project website, but Google Storage also understands the Amazon S3 protocol, so any existing tool for S3 can work with Google Storage with minor (if any) modifications.

Among all the toolkits for Amazon S3, JetS3t is probably the best known and most widely adopted. It is an open-source Java suite that officially added support for Google Storage in the recently released version 0.8.0.

JetS3t is released under the Apache License 2.0, so it is free for commercial or non-commercial use and can be modified as needed. Its source code can be downloaded from the project page on BitBucket where you can also find the instructions on how to build it or contribute to the project.

Using JetS3t to write a Java application that interacts with Google Storage is very easy. It is possible to connect to the service, create buckets and upload/download objects with fewer than 10 lines of code:

// Google Storage login credentials are required to manage GS accounts.
GSCredentials credentials = new GSCredentials(ACCESS_KEY, SECRET_KEY);

// To communicate with Google Storage use the GoogleStorageService.
GoogleStorageService service = new GoogleStorageService(credentials);

// To store data in GS you must first create a bucket, a container for objects.
GSBucket bucket = service.createBucket("jets3t-test-bucket");

// Create a GSObject based on a local file
File localFile = new File("my_file.txt");
GSObject localObject = new GSObject(localFile);

// Upload the object to our test bucket in GS.
localObject = service.putObject("jets3t-test-bucket", localObject);

// Download the data object we just uploaded
GSObject remoteObject = service.getObject("jets3t-test-bucket", "my_file.txt");

The GoogleStorageService exposes all the methods to manipulate the cloud filesystem and only takes a set of credentials to be instantiated. Besides the basic set of commands presented in the code snippet above, there are other methods to manage Access Control Lists or to copy/move/rename objects.
The source code distribution also includes a complete code sample that covers all the functionality described in the Google Storage Developer's Guide.

Are you already thinking about the next big thing that can be built on top of this service?


Google events in Europe

It's time again for a series of Google events in Europe, where you can learn about the newest technologies and have the chance to put your questions to Google engineers.

Google Developer Days 2010 will be held in Europe in November according to the following calendar:

Registration for these three events will open on September 22nd so save the date because the available seats usually run out very quickly!

Next week we'll also be holding a different event in Spain, the Madrid DevFest 2010. Unfortunately, registration for this event is already closed and there's no way to request extra seats, as it is fully booked.

I'll be presenting at all these events on the Google Apps APIs and the Google Apps Marketplace, and I'll be glad to answer your questions on these topics.
Hope to see a lot of you there!


Spell checking with F#

Some months ago I wrote an introductory article on F# for the "IoProgrammo" magazine (sorry, Italian only!) and now I have published a second article in the latest issue of the same magazine.

This new article covers more advanced topics and is focused on writing a basic spell checker that mixes together the functional and object-oriented programming paradigms.

The spell checking algorithm is implemented in functional F# and is based on the Jaro-Winkler similarity measure, while the UI is WPF-based and written with OO code.
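For readers who have not met Jaro-Winkler before, here is a minimal sketch of the measure and of how a spell checker might use it to pick a suggestion. It is written in Java rather than F# and is not the code from the article; the class and method names (JaroWinkler, jaro, similarity, bestMatch) are made up for this example. The prefix weight of 0.1 and the maximum prefix length of 4 are the values commonly used with Winkler's variant.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative implementation of Jaro-Winkler similarity (1.0 = identical).
final class JaroWinkler {
    private JaroWinkler() {}

    /** Plain Jaro similarity in the range [0, 1]. */
    static double jaro(String a, String b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        if (a.isEmpty() || b.isEmpty()) return 0.0;
        // Characters match only if they are equal and close enough in position.
        int window = Math.max(0, Math.max(a.length(), b.length()) / 2 - 1);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(b.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!bMatched[j] && a.charAt(i) == b.charAt(j)) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) continue;
            while (!bMatched[j]) j++;
            if (a.charAt(i) != b.charAt(j)) outOfOrder++;
            j++;
        }
        double m = matches;
        double transpositions = outOfOrder / 2.0;
        return (m / a.length() + m / b.length() + (m - transpositions) / m) / 3.0;
    }

    /** Jaro similarity boosted for strings sharing a common prefix (up to 4 chars). */
    static double similarity(String a, String b) {
        double base = jaro(a, b);
        int maxPrefix = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < maxPrefix && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return base + prefix * 0.1 * (1.0 - base);
    }

    /** Returns the dictionary word most similar to the input, or null if the dictionary is empty. */
    static String bestMatch(String word, List<String> dictionary) {
        String best = null;
        double bestScore = -1.0;
        for (String candidate : dictionary) {
            double score = similarity(word, candidate);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("receive", "recipe", "relieve");
        System.out.println(bestMatch("recieve", dictionary));
    }
}
```

A real spell checker would normally also apply a minimum score threshold, so that words with no close dictionary entry produce no suggestion at all.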

I hope you will appreciate the article and I'll be very happy to get any feedback from the readers.

IoProgrammo - February 2010



Ruby outperforms Python with Project Voldemort

If you are performing statistical analysis on a huge amount of data (think of Twitter data), then the database can become a real bottleneck, and that's why interest in the NoSQL movement is growing quickly.

One of the most popular distributed key-value stores that tries to overcome this problem is Project Voldemort, an open-source project based on Amazon Dynamo and sponsored by LinkedIn, which uses it for some high-scalability storage problems.

Project Voldemort is written in Java and also provides developers with C++ and Python client libraries to access the store. One thing that (strangely) was missing is support for the Ruby language, mainly because of the lack of a stable Google Protocol Buffers implementation for this language.

There is, however, a gem called ruby_protobuf which, despite being an alpha release, turned out to be reliable enough for my purpose of porting the Project Voldemort Python client library to Ruby.

The library I wrote is called voldemort-ruby-client and is now released under the Apache 2.0 License on Google Code, so it is absolutely free for you to experiment with it.

While writing the library I also ported the Python test cases to Ruby, and I found the Ruby version to be 3000 times faster than the Python one!
Does anybody have a suggestion as to the reason for this outstanding improvement?

On my machine the Ruby client performs about 2 million PUT (or GET) requests per second, against about 600 for the Python client.
Is there anybody else willing to repeat the benchmark on their own machine and publish the results?
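If you want to repeat the measurement, the idea is simply to time a fixed number of requests and divide. The sketch below shows that idea in Java, the language of the other example on this blog, rather than in Ruby or Python. ThroughputBenchmark and requestsPerSecond are hypothetical names, and the in-memory HashMap is only a stand-in: it is not a Voldemort client, so the numbers it prints say nothing about the real store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

// Minimal timing harness: runs a request N times and reports requests per second.
final class ThroughputBenchmark {
    private ThroughputBenchmark() {}

    static double requestsPerSecond(int requests, IntConsumer request) {
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            request.accept(i);
        }
        // Guard against a zero elapsed time on very coarse clocks.
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return requests * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        // Stand-in for a key-value store client (assumption: swap in a real client here).
        Map<String, String> store = new HashMap<>();
        int requests = 100_000;
        double putRate = requestsPerSecond(requests, i -> store.put("key" + i, "value" + i));
        double getRate = requestsPerSecond(requests, i -> store.get("key" + i));
        System.out.printf("PUT: %.0f requests/s, GET: %.0f requests/s%n", putRate, getRate);
    }
}
```

When comparing two clients this way, it helps to check that each request really completes a round trip to the server before the next one starts, and to run both against the same server and data set.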