
Data war: SQL vs NoSQL vs NewSQL

As the volume of data in the cloud grows, so do its other characteristics, the 5Vs: volume, variety, variability, value and velocity.
The data war to grab a piece of the revenue from storing this data, analysing it and extracting insight from it is ongoing.
Example: Oracle bought BlueKai (an advertisement targeting company) to compete with the cloud CRM vendor salesforce.com.
See how closely marketing and advertisement targeting have become linked to customer relationship management in a short span of time. But this is only the uppermost layer of the fight for a piece of customer analytics and insights.
Most of this data has the 5V nature, like Facebook or Twitter posts.
Read about the 5Vs in my other blog post (related links below).

Where and how this data is stored in the underlying layers is where the new fight is on: RDBMS (SQL) vs NoSQL vs NewSQL.

An RDBMS maintains data with ACID properties (atomicity, consistency, isolation, durability). But not every scenario needs all of them; there may be some flexibility possible in A, C, I and D.
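
To make the "A" in ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names and the transfer function are all my own assumptions for illustration, not something from a particular vendor. The credit is deliberately done before the debit, so that when the debit breaks the balance rule the already applied credit has to be rolled back as well.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first on purpose: if the debit below fails,
            # this update must be undone too (atomicity).
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False

def balances(conn):
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

# Hypothetical in-memory table, only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
print(ok, failed, balances(conn))
```

The second transfer fails as a whole: bob does not keep the 500 even though his row was updated first. A store that relaxes this property would have to handle such half-applied changes in the application.
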
The CAP theorem applies in all such scenarios: the focus of a database is to provide consistency, availability and partition tolerance, but a distributed system cannot fully guarantee all three at the same time.

Consistency (all servers or clusters see the same, latest version of the data in real time).
Availability (data is always available, even in case of node failure).
Partition tolerance (the system keeps working even when a network partition cuts off communication between nodes).

Here is how the vendors stack up against CAP:
[Figure: truth-of-CAP-theorem diagram]

SQL/RDBMS vendors:
– able to offer ACID but not able to scale linearly with vast volumes of data. The focus is on consistency and availability.

NoSQL vendors:
– able to scale with vast 5V data but not able to offer full consistency or transaction capability. There is a tradeoff between consistency, availability and partition tolerance.
For example, Cassandra focuses on AP but offers eventual consistency by other means: replicated clusters update stale data over time.
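
Eventual consistency is easier to see in a toy model. The sketch below is not Cassandra's API or its actual implementation; the Replica class, the anti_entropy function and the integer timestamps are all assumptions I made up for illustration. Each replica keeps the version with the newest timestamp (last write wins), and a later sync brings the stale replica up to date.

```python
class Replica:
    """Toy replica: stores key -> (timestamp, value), newest timestamp wins."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def write(self, key, value, ts):
        current = self.store.get(key)
        # Accept the write only if it is newer than what we already hold.
        if current is None or ts > current[0]:
            self.store[key] = (ts, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

def anti_entropy(a, b):
    """Exchange entries in both directions so both replicas hold the newest version."""
    for src, dst in ((a, b), (b, a)):
        for key, (ts, value) in list(src.store.items()):
            dst.write(key, value, ts)

r1, r2 = Replica("r1"), Replica("r2")

# Both replicas start with the same data.
r1.write("status", "old", ts=1)
r2.write("status", "old", ts=1)

# During a partition only r1 receives the update; both stay available.
r1.write("status", "new", ts=2)
stale = r2.read("status")      # r2 still answers, but with old data

# Once the replicas can talk again, a sync repairs the stale copy.
anti_entropy(r1, r2)
converged = r2.read("status")
print(stale, converged)
```

Between the write and the sync, a reader on r2 gets stale data. That window is the price paid for staying available during the partition.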

NewSQL: offers both scale and ACID, like VoltDB.

Basically, as the load on the DB server grows and we add more servers, say from 1 to 5 to 10, the performance should also scale from 1 to 5 to 10. But this is not possible in an RDBMS: it scales, but not linearly. It may achieve only 1 to 2 to 4 as we go up to 10 servers or clusters, as with Real Application Clusters.
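
The gap between linear and sub-linear scaling can be put into a number. The small sketch below uses only the illustrative figures from the paragraph above (they are not benchmark results), and the function name scaling_efficiency is my own. It divides the achieved throughput by what perfectly linear scaling would give.

```python
def scaling_efficiency(servers, throughput):
    """Achieved throughput divided by ideal linear throughput, per cluster size."""
    # Throughput of one server in the smallest configuration is the baseline.
    base_per_server = throughput[0] / servers[0]
    return [t / (n * base_per_server) for n, t in zip(servers, throughput)]

servers = [1, 5, 10]
linear = [1, 5, 10]   # ideal: performance grows 1 -> 5 -> 10
rdbms = [1, 2, 4]     # illustrative RDBMS cluster: 1 -> 2 -> 4

linear_eff = scaling_efficiency(servers, linear)
rdbms_eff = scaling_efficiency(servers, rdbms)

for n, ideal, actual in zip(servers, linear_eff, rdbms_eff):
    print(f"{n:>2} servers: linear {ideal:.0%}, RDBMS {actual:.0%}")
```

With these numbers the RDBMS cluster delivers only about 40% of what each added server should contribute.
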
As we move into this vast volume of 5V data, the upper layer of analysis and data warehousing has already changed with Apache Hadoop Hive, a data warehouse made especially for unstructured data (read link below).

NewSQL: able to offer ACID as well as CAP, but there is

Cloud Computing's relation to Business Intelligence and Data Warehousing
Read:
1. http://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
2. http://sandyclassic.wordpress.com/2013/06/18/bigdatacloud-business-intelligence-and-analytics/

Cloud Computing and Unstructured Data Analysis Using Apache Hadoop Hive
Read:
http://sandyclassic.wordpress.com/2013/10/02/architecture-difference-between-sap-business-objects-and-ibm-cognos/
It also compares the architecture of two popular BI tools.

Cloud Data Warehouse Architecture:
http://sandyclassic.wordpress.com/2011/10/19/hadoop-its-relation-to-new-architecture-enterprise-datawarehouse/

Future of BI
No one can predict the future, but these are the directions in which BI is moving.
http://sandyclassic.wordpress.com/2012/10/23/future-cloud-will-convergence-bisoaapp-dev-and-security/

 
