Thursday, September 24, 2009

Cassandra DB articles

A compilation of popular articles (growing list) on Cassandra. (View a list of clients to access Cassandra)

Official wiki for Cassandra
http://wiki.apache.org/cassandra/
http://wiki.apache.org/cassandra/ArticlesAndPresentations

Most Popular articles
Cassandra (Bigtable+Dynamo) - Jonathan Ellis
What every developer should know about database scalability - Jonathan Ellis
Cassandra Project, Rackspace article - Jonathan Ellis
Up and running with cassandra - Evan Weaver
WTF is a SuperColumn? An Intro to the Cassandra Data Model - Digg's Arin Sarkissian
Looking to the future with Cassandra - Digg's Ian
BlueRunner: Building an Email Service in the Cloud - IBM's Jun Rao
Cassandra and Ruby: A Love Affair? - Engine Yard
Building Scalable Databases: Denormalization, the NoSQL Movement and Digg - Dare Obasanjo

Structured Storage System over a P2P Network - By Avinash Lakshman et al.
Data Presentations Cassandra Sigmod
Cassandra presentation at NoSQL (same, with more details)
Cassandra – A structured storage system on a P2P Network - Facebook Notes
Cassandra - E-Team Lecture, Facebook Video
NoSQL - Cassandra Video

Google BigTable paper by Fay Chang et al.
Bigtable: A Distributed Storage System for Structured Data

Amazon Dynamo paper by Giuseppe DeCandia et al. (co-authored by Avinash Lakshman)
Amazon Dynamo paper

The phi Accrual Failure Detector by Naohiro Hayashibara
phi Accrual Failure Detector

Werner Vogels on distributed systems
Eventually Consistent - Revisited
Amazon Dynamo

Roadmap and interesting issues
Cassandra Roadmap
Proactive repair - Merkle trees?
Cassandra data model misconceptions, and their sources
Hadoop integration
Ingesting from Hadoop to Cassandra
Mailing list archives: cassandra-user@incubator.apache.org
org.apache.incubator.cassandra-user - Mark Mail
Cassandra User Mail Archive
fauna (Twitter's ruby client) documentation

Some interesting NoSQL articles
NoSQL debrief
Anti-RDBMS: A list of distributed key-value stores
Needle in a Haystack: Efficient Storage of Billions of Photos
NoSQL: If Only It Was That Easy
Quick Reference to Alternative data storages
Some Notes on Distributed Key Stores
Key Value Store List
Cassandra Vs CouchDB
NoSQL and the Relational Model: don’t throw the baby out with the bathwater
Why we migrated from mysql to mongodb
No to SQL? Anti-database movement gains steam
Should you go Beyond Relational Databases?
Adventures with Cassandra Distributed Database

Friday, August 21, 2009

Cassandra DB clients

Have been playing around with Cassandra for some time. Cassandra is a hybrid of Dynamo and BigTable. More details on my experiences in a later post. This is just a placeholder to keep track of the growing list of Cassandra clients. Please let me know if you would like one added. (View a list of popular articles on Cassandra).

BTW, I have tried only the Java Thrift interface. It works, but it is very basic and has a lot of room to improve. The other high-level APIs are all still evolving. Will comment as I get a chance to try more. Twitter's fauna/cassandra (rb) and Digg's lazyboy (py) seem very promising. Looks like Digg has recently gone to production on Cassandra. Nodeta's scalandra (scala) was extracted from Flowdock.

Lowlevel:

Thrift
https://svn.apache.org/repos/asf/incubator/cassandra/trunk/interface/cassandra.thrift
(java, cpp, csharp, php, perl, rb)

Highlevel:

Java
Start writing :) Wait you should be using Scala (jdk7)?

Scala
http://github.com/viktorklang/Cassidy
http://github.com/nodeta/scalandra
http://github.com/stevej/cassandra_client_scala
http://github.com/jboner/akka

Python
http://github.com/digg/lazyboy

Ruby
http://github.com/fauna/cassandra
http://github.com/NZKoz/cassandra_object

Clojure
http://github.com/mattrepl/clojure-cassandra

Wednesday, June 10, 2009

Override default JAX-WS endpoint address in wsdl


((BindingProvider)port).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, MyENDPOINT);
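For context, here is a minimal sketch of the same override wrapped in a small helper. It works on a plain Map so that it compiles without JAX-WS on the classpath; in real client code you would pass ((BindingProvider) port).getRequestContext(), which is a Map<String, Object>. The class name, the helper name, and the example URL are made up for illustration. The string constant is the value of BindingProvider.ENDPOINT_ADDRESS_PROPERTY in the javax.xml.ws API.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map;

class EndpointOverride {
    // Value of BindingProvider.ENDPOINT_ADDRESS_PROPERTY (javax.xml.ws API).
    static final String ENDPOINT_ADDRESS_PROPERTY = "javax.xml.ws.service.endpoint.address";

    // Hypothetical helper: checks that the address is an absolute URI, then
    // stores it in the request context so it replaces the address from the WSDL.
    static void overrideEndpoint(Map<String, Object> requestContext, String address) {
        URI uri;
        try {
            uri = new URI(address);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid endpoint address: " + address, e);
        }
        if (!uri.isAbsolute()) {
            throw new IllegalArgumentException("Endpoint address must be absolute: " + address);
        }
        requestContext.put(ENDPOINT_ADDRESS_PROPERTY, address);
    }

    public static void main(String[] args) {
        // Stand-in for ((BindingProvider) port).getRequestContext().
        Map<String, Object> requestContext = new HashMap<>();
        // Hypothetical endpoint URL, for illustration only.
        overrideEndpoint(requestContext, "http://localhost:8080/myservice");
        System.out.println(requestContext);
    }
}
```

The validation step is optional; it just fails early on a bad address instead of at the first web service call.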

Monitor JAX-WS soap traffic request and response


System.setProperty("com.sun.xml.ws.transport.http.client.HttpTransportPipe.dump", "true");
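A small sketch of how this might be wired into a client. Two assumptions here: as far as I know the flag is read once when the transport class is loaded, so it should be set before the first service or port is created; and the copy of JAX-WS bundled inside the JDK appears to use the same property name under com.sun.xml.internal.ws, so the sketch sets both. The class and method names are made up.

```java
class SoapDump {
    // Property from the post above, used by the standalone JAX-WS RI client transport.
    static final String RI_CLIENT_DUMP =
        "com.sun.xml.ws.transport.http.client.HttpTransportPipe.dump";

    // Assumed equivalent for the JAX-WS implementation bundled inside the JDK.
    static final String JDK_CLIENT_DUMP =
        "com.sun.xml.internal.ws.transport.http.client.HttpTransportPipe.dump";

    // Turns on request/response dumping for both variants.
    static void enableClientDump() {
        System.setProperty(RI_CLIENT_DUMP, "true");
        System.setProperty(JDK_CLIENT_DUMP, "true");
    }

    public static void main(String[] args) {
        // Call this before creating the service/port.
        enableClientDump();
        System.out.println(Boolean.getBoolean(RI_CLIENT_DUMP));
    }
}
```

The same flag can be passed on the command line with -D instead of being set in code, which avoids recompiling just to look at the traffic.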

Tuesday, June 09, 2009

Minimalistic CSV parser

List<String> notes = Arrays.asList(note.split(","));
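A slightly fuller sketch of the same idea, still minimalistic. It makes two changes that are my own additions, not part of the one-liner: it passes a limit of -1 to split() so that trailing empty fields are kept (plain split(",") silently drops them), and it trims whitespace around each field. It does not handle quoted fields or commas inside values; for that a real CSV library is the better choice. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

class MiniCsv {
    // Splits one line on commas, keeps trailing empty fields, trims whitespace.
    // No support for quoted fields, embedded commas, or embedded newlines.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        // A limit of -1 stops split() from discarding trailing empty strings.
        for (String field : line.split(",", -1)) {
            fields.add(field.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Five fields: "a", "b", "", "c", "".
        System.out.println(parseLine("a, b,,c,"));
    }
}
```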

Wednesday, April 08, 2009

Google App Engine adds Java support

GAE adds much-awaited Java support. If you want to try Java on GAE, sign up now. Access is limited to the first 10,000 developers. http://appengine.google.com/promo/java_runtime.

New in GAE :
  • cron support
  • database import
  • secure data connector
  • early java support
Features :
  • 100% Java standards based (servlet, jsp, war, web.xml)
  • Java Runtime support for 1.6 JVM
  • Eclipse plugin (use other ide or cmd line)
  • Integrated GWT
  • JPA and JDO support for Bigtable
More details : http://googleappengine.blogspot.com/2009/04/seriously-this-time-new-language-on-app.html

Download SDK : http://code.google.com/appengine/docs/java/overview.html

Sunday, February 01, 2009

GDrive from Google

There has been a lot of chatter that Google might be introducing GDrive soon. GDrive is expected to let the user access their files from anywhere, anytime, from any device (desktop, web browser or mobile phone). Nothing new about the concept of Cloud storage, but coming from Google and typically with a large free allowance it might be interesting to watch the space.

Recently tried ZumoDrive, which essentially adds a z: drive and seamlessly synchronizes all files to the Cloud. Files from the drive are available in the Cloud and can be synchronized to any device, including the iPhone. There is also a web interface to manage the files in the cloud. The client application installed on the Windows PC was using Jetty (embedded Java web server) and HSQLDB (small in-memory/disk-based database written in Java). Hmm... did you know you were letting them run a web server and a database on your machine just by installing the client?

There are many products already in the online storage space (Microsoft Live Mesh, SugarSync, etc.), and GDrive is really a late entry. Most of them support access from anywhere, on any device. Another interesting technology is WebDAV, which has been around since 1996 and has been gaining more attention since the Cloud storage scene became hot.

I am getting a feeling that the next big thing, the "Semantic Web", is slowly shaping up.

Saturday, January 31, 2009

What's up with Google search results

This morning all Google search results were being tagged as potentially dangerous. Seems like a bug, or probably due to StopBadware.org being down. The issue seems to have been fixed by now. See the following snapshots. Oops!

Update: It seems the feed from StopBadware.org that Google checks against had an entry for "/" due to "human error", which essentially flagged all URLs as dangerous. Read the official explanation from Google and StopBadware.org.
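To see why a single "/" entry is enough to flag everything, here is a purely hypothetical sketch. The actual matching logic Google uses is not described anywhere in this post; the sketch simply assumes that a URL is flagged when it contains any entry from the list. All names and URLs below are made up.

```java
import java.util.Arrays;
import java.util.List;

class BlocklistSketch {
    // Hypothetical matcher: a URL is flagged if it contains any blocklist entry.
    static boolean isFlagged(String url, List<String> blocklist) {
        for (String entry : blocklist) {
            if (url.contains(entry)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> good = Arrays.asList("badsite.example/malware");
        List<String> broken = Arrays.asList("badsite.example/malware", "/");
        String url = "http://www.example.com/index.html";

        System.out.println(isFlagged(url, good));   // false
        System.out.println(isFlagged(url, broken)); // true: "/" occurs in every URL
    }
}
```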


View Screen shots :