Relational to NoSQL - Migration
A primer on database design for Google Bigtable
Dec. 6, 2010 12:55 PM
NoSQL Databases & the Bigtable Revolution
In computing, NoSQL is a term used to designate database management systems that differ from classic relational database management systems in some way. These data stores may not require fixed table schemas, usually avoid join operations, and typically scale horizontally. Academic papers typically refer to these databases as structured storage, a term that would include classic relational databases as a subset.
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.
As is evident, Google's Bigtable is one of the successful implementations of a NoSQL database, so it is prudent to weigh the advantages of NoSQL databases rather than dismiss them. In this context, viewing common enterprise applications, which are typically modeled using relational databases, from the point of view of NoSQL, and of Google Bigtable in particular, is useful for implementing new scalable solutions for the enterprise in the cloud.
Bigtable Data Model vs. Relational Data Model
Bigtable's database design and data model differ significantly from those of traditional relational databases in several respects. The table below provides a quick comparison of the two.
| Bigtable NoSQL Data Model | Relational Data Model |
| --- | --- |
| Uses a multidimensional sorted map as its data structure; each value in the map is identified by a key combination of (row, column, timestamp) | Only two dimensions: rows and columns |
| Each row/column combination can store multiple versions | Each row/column combination can store only one version at any point in time |
| A table can have an unbounded number of columns | Tables typically have a fixed number of columns |
| Arbitrary "columns" on a row-by-row basis | Columns are fixed and apply to all rows |
| Column keys are grouped into sets called column families, which form the basic unit of access control | No concept of column families |
| No multi-row transactions; only single-row transactions are supported | Multi-row transactions are supported |
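The first two rows of the comparison can be made concrete with a toy in-memory model. This is only a sketch of the (row, column, timestamp) map idea in plain Python, not the Bigtable API: the class `ToyBigtable`, its method names, the `family:qualifier` column strings, and the integer timestamps are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyBigtable:
    """Hypothetical in-memory illustration of a (row, column, timestamp) -> value map."""

    def __init__(self):
        # row key -> "family:qualifier" column key -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        # Writing the same (row, column) at a new timestamp adds a version
        # instead of overwriting the previous one.
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def versions(self, row, column):
        """Return all (timestamp, value) pairs, newest first."""
        cells = self._rows.get(row, {}).get(column, {})
        return sorted(cells.items(), reverse=True)

    def scan(self):
        """Yield (row key, columns) in sorted row-key order."""
        for row in sorted(self._rows):
            yield row, self._rows[row]


# Sample values below are made up for the illustration.
table = ToyBigtable()
table.put("emp002", "basic:name", 1, "B. Example")
table.put("emp001", "history:job_title", 1, "Engineer")
table.put("emp001", "history:job_title", 2, "Senior Engineer")

latest_title = table.get("emp001", "history:job_title")
title_at_t1 = table.get("emp001", "history:job_title", timestamp=1)
row_order = [row for row, _ in table.scan()]
```

In a relational table the second `put` would be an update that replaces the old job title; here both versions remain addressable by timestamp, and rows come back in key order regardless of insertion order.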
Case Study: Migrating a Relational DB Model to NoSQL/Bigtable
The following enterprise scenario shows how a relational database design can be re-expressed as a NoSQL/Bigtable design.
In this scenario, an enterprise stores information about its employees; in a typical relational model, the following tables would be used. This is a sample representation meant to explain the design principles of Bigtable; it does not necessarily represent an employee in a real enterprise, which may have more attributes.
- Employee Base Table (Basic Attributes of Employee)
- Employee Educational Qualifications (Child Table with room to store 1:N Qualification Details)
- Employee Address (Child Table with room to store multiple addresses for work, home, etc.)
- Employee History (History of changes for the Employee in the organization over a period)
The diagram below sketches this data model in a traditional relational database design.
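As a rough textual stand-in for the relational design, the four tables might be declared as follows using Python's built-in `sqlite3` module. The table names, column names, types, and keys are assumptions made for illustration (the attribute names are borrowed from the Bigtable column list later in this article), and the sample rows are invented.

```python
import sqlite3

# Hypothetical schema: one base table plus three child tables keyed by employee_id.
SCHEMA = """
CREATE TABLE employee (
    employee_id TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    dob         TEXT,
    photo       BLOB,
    start_date  TEXT
);
CREATE TABLE employee_qualification (
    employee_id TEXT NOT NULL REFERENCES employee(employee_id),
    level       TEXT NOT NULL,
    degree      TEXT,
    institution TEXT,
    marks       REAL,
    passed_year INTEGER,
    PRIMARY KEY (employee_id, level)
);
CREATE TABLE employee_address (
    employee_id  TEXT NOT NULL REFERENCES employee(employee_id),
    address_type TEXT NOT NULL,
    door TEXT, street TEXT, city TEXT, state TEXT, country TEXT,
    PRIMARY KEY (employee_id, address_type)
);
CREATE TABLE employee_history (
    employee_id    TEXT NOT NULL REFERENCES employee(employee_id),
    effective_date TEXT NOT NULL,
    job_title      TEXT NOT NULL,
    PRIMARY KEY (employee_id, effective_date)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO employee (employee_id, name) VALUES ('emp001', 'A. Example')")
conn.executemany(
    "INSERT INTO employee_history VALUES (?, ?, ?)",
    [("emp001", "2008-01-01", "Engineer"), ("emp001", "2010-06-01", "Senior Engineer")],
)

# Reading an employee together with the job history requires a join across tables.
rows = conn.execute(
    """
    SELECT e.name, h.effective_date, h.job_title
    FROM employee AS e
    JOIN employee_history AS h ON h.employee_id = e.employee_id
    WHERE e.employee_id = ?
    ORDER BY h.effective_date
    """,
    ("emp001",),
).fetchall()
```

Each 1:N relationship becomes its own child table, and every change of job title is a new row in `employee_history` rather than a new version of a single cell.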
The design for this ER model in a Bigtable/NoSQL model has the following salient features.

- Row key will be represented by Employee ID
- Column Family 1 : Basic with Columns (Name, DOB, Photo, Start Date)
- Column Family 2 : Address With Columns (Door, Street, City, State, Country)
- Column Family 3 : Education With Variable Set of Columns
  - (High School Degree, High School Institution, High School Marks, High School Passed Year)
  - Variable Columns (Graduate Degree, Graduate Institution, Graduate Marks, Graduate Passed Year)
  - Variable Columns (Masters Degree, Masters Institution, Masters Marks, Masters Passed Year)
- Column Family 4 : History with the Column Job Title, which is Multi-Versioned
The diagram below gives a pictorial representation of the data model under the Bigtable/NoSQL model. Even in this small example, we see a lot of flexibility in data design and storage compared to a relational database.
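The same design can be sketched in plain Python as nested dictionaries: row key, then column family, then column, then timestamped versions. This is an illustration of the layout only, not Bigtable client code; the helpers `put` and `latest`, the lower-case family and column names, the integer timestamps, and all sample values are assumptions.

```python
def put(table, row_key, family, column, value, timestamp):
    """Store one versioned cell at table[row_key][family][column][timestamp]."""
    cell = table.setdefault(row_key, {}).setdefault(family, {}).setdefault(column, {})
    cell[timestamp] = value


def latest(table, row_key, family, column):
    """Return the most recent version of a cell."""
    versions = table[row_key][family][column]
    return versions[max(versions)]


employees = {}

# Row key is the Employee ID; "basic" and "address" hold ordinary attributes.
put(employees, "emp001", "basic", "name", "A. Example", 1)
put(employees, "emp001", "basic", "start_date", "2008-01-01", 1)
put(employees, "emp001", "address", "city", "Example City", 1)

# "education" columns vary per row: each employee stores only what applies.
put(employees, "emp001", "education", "graduate_degree", "B.E.", 1)
put(employees, "emp001", "education", "masters_degree", "M.S.", 1)
put(employees, "emp002", "basic", "name", "B. Example", 1)
put(employees, "emp002", "education", "high_school_institution", "Example High School", 1)

# "history" keeps every job title as a new version of the same cell.
put(employees, "emp001", "history", "job_title", "Engineer", 1)
put(employees, "emp001", "history", "job_title", "Senior Engineer", 2)

job_history = sorted(employees["emp001"]["history"]["job_title"].items())
education_columns = {
    row_key: sorted(employees[row_key].get("education", {}))
    for row_key in sorted(employees)
}
```

Everything about one employee sits under a single row key, so reading it needs no join, and the two employees carry different `education` columns without any schema change.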

Summary
This simple example of an Employee entity may not be an ideal candidate for a NoSQL database like Bigtable; however, it gives an idea of how a relational database design needs to be viewed in the NoSQL world. This design is more applicable to unstructured content.
About Srinivasan Sundara Rajan
Srinivasan Sundara Rajan works at Gavs Technologies as a Chief Architect. His primary focus is enabling Agile Enterprises by facilitating the adoption of the Everything as a Service model, with particular concentration on BPaaS (Business Process as a Service). Srinivasan is currently writing a series of articles on industry SaaS/BPaaS use cases that enterprises can adopt. All the views expressed are Srinivasan's independent analysis of industry and solutions and are not necessarily those of his current or past organizations. Srinivasan would like to thank everyone who augmented his architectural skills with analytical ideas.