Guide to NoSQL (part 3)


 

7      Graph DB

So what is the difference between object-oriented and graph-oriented databases? Both deal with an object's topology and have to handle complex relationship issues. I found this interesting answer on the InfoGrid blog:

Object and graph databases operate on two different levels of abstraction. An object database's main data elements are objects, the way we know them from an object-oriented programming language. A graph database's main data elements are nodes and edges. An object database does not have the notion of a (bidirectional) edge between two things with automatic referential integrity, etc. A graph database does not have the notion of a pointer that can be NULL. A graph database is independent of the application platform (Java, C#, etc.). On the other hand, using a graph database you can't simply take an arbitrary object and persist it.
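To make the distinction concrete, here is a minimal, hypothetical sketch of the graph data model in Java: nodes and typed, bidirectional edges are first-class elements, and an edge always knows both of its endpoints. The class and field names are illustrative, not taken from any particular product:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // A minimal property-graph model: nodes and edges are first-class.
    class Node {
        final long id;
        final Map<String, Object> properties = new HashMap<>();
        final List<Edge> edges = new ArrayList<>();  // every edge touching this node

        Node(long id) { this.id = id; }
    }

    class Edge {
        final String type;  // e.g. "FRIEND_OF"
        final Node from;
        final Node to;

        Edge(String type, Node from, Node to) {
            this.type = type;
            this.from = from;
            this.to = to;
            // Registering the edge on both endpoints keeps the relationship
            // bidirectional and referentially intact - there is no dangling pointer.
            from.edges.add(this);
            to.edges.add(this);
        }
    }

Note how the edge itself, not a nullable pointer inside an object, carries the relationship - which is exactly the distinction the quote above draws.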

 

Below are several examples of graph databases:

 

  • Bigdata® - a scale-out storage and computing fabric supporting optional transactions, very high concurrency, and very high aggregate IO rates.
  • Neo4J - an open-source / commercial graph database with an embedded, disk-based, fully transactional Java persistence engine.
  • InfoGrid - a Web Graph Database with many additional software components that make the development of REST-ful web applications on a graph foundation easy. InfoGrid is open source and is developed in Java.
  • Infinite Graph - a highly scalable, distributed, and cloud-enabled commercial product with flexible licensing for startups.
  • Trinity - a distributed in-memory graph engine under development at Microsoft Research.
  • HyperGraphDB - a general-purpose, open-source graph and Java object-oriented (hybrid) database based on a powerful knowledge management formalism known as directed hypergraphs.
  • AllegroGraph - a closed-source, scalable, high-performance graph database.
  • OrientDB - a high-performance, open-source document-graph (hybrid) database.
  • VertexDB - a high-performance graph database server that supports automatic garbage collection.

In addition, we can't leave graph databases without mentioning a new approach to graph processing called Pregel, created by Google on top of the BSP (Bulk Synchronous Parallel) computational model.

 

Claudio Martella describes Pregel as follows: Pregel is a system for large-scale graph processing. It provides a fault-tolerant framework for the execution of graph algorithms in parallel over many machines. Think of it as MapReduce re-thought for graph operations.

 

But what's wrong with MapReduce and graph algorithms? Nothing in particular, though it can lead to suboptimal performance because the graph state has to be passed from one phase to the next, generating a lot of I/O. It also has usability issues, since it doesn't provide a way to do a per-vertex calculation; in general, it's not easy to express graph algorithms in M/R. Pregel fills a gap, since there were no frameworks for graph processing that address both distribution and fault tolerance.

 

Like M/R, Pregel lets you define Combiners to reduce message-passing overhead: messages are combined together where semantically possible. Like Sawzall (a procedural domain-specific programming language used by Google to process large numbers of individual log records), Pregel provides Aggregators, which allow global communication by receiving messages from multiple vertices, combining them, and sending the result back to the vertices. They are useful for statistics (think of a histogram of vertex degrees) or for global control (for example, an aggregator can collect all the vertices' PageRank deltas to compute the convergence condition).
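To illustrate the vertex-centric model, here is a minimal, self-contained sketch of PageRank in the Pregel style. This is not Google's or any framework's actual API; the Vertex base class and its methods are made up for illustration:

    import java.util.List;

    // A made-up, minimal Pregel-style vertex: in each superstep it receives the
    // messages sent to it in the previous superstep, updates its value, and
    // sends new messages along its outgoing edges.
    abstract class Vertex {
        double value;
        long[] outEdges;          // ids of neighboring vertices
        boolean halted;

        abstract void compute(List<Double> messages, int superstep);

        void sendMessageToAll(double msg) { /* the runtime would deliver msg to every vertex in outEdges */ }
        void voteToHalt() { halted = true; }
    }

    class PageRankVertex extends Vertex {
        static final int MAX_SUPERSTEPS = 30;

        @Override
        void compute(List<Double> messages, int superstep) {
            if (superstep > 0) {
                double sum = 0;
                for (double m : messages) sum += m;  // a Combiner could pre-sum these on the sender
                // Simplified update; the full formula divides 0.15 by the total vertex count.
                value = 0.15 + 0.85 * sum;
            }
            if (superstep < MAX_SUPERSTEPS) {
                sendMessageToAll(value / outEdges.length);  // spread rank over out-edges
            } else {
                voteToHalt();  // the job ends once every vertex has voted to halt
            }
        }
    }

The "think like a vertex" style is the point: the algorithm is written per vertex, while the framework handles distribution, message delivery between supersteps, and fault tolerance.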

 

Pregel inspired several projects that provide similar APIs:

 

  • Giraph - a graph-processing framework that implements Pregel's API and is launched as a typical Hadoop job, leveraging existing Hadoop infrastructure.
  • Apache Hama - a distributed computing framework based on BSP computing techniques for massive scientific computations, e.g., matrix, graph, and network algorithms.
  • GoldenOrb - a cloud-based open source project for massive-scale graph analysis, built upon best-of-breed software from the Apache Hadoop project modeled after Google’s Pregel architecture.
  • Phoebus - a Pregel implementation in Erlang.
  • Signal/Collect - a framework for synchronous and asynchronous parallel graph processing. It allows programmers to express many algorithms on graphs in a concise and elegant way.
  • HipG - a library for high-level parallel processing of large-scale graphs. HipG is implemented in Java and is designed for a distributed-memory machine. Besides basic distributed graph algorithms, it handles divide-and-conquer graph algorithms and algorithms that execute on graphs created on-the-fly.

8      CEP (Complex Event Processing)

Complex event processing is one of the most important parts of business decision making and real-time business analytics (the opposite of batch processing). Wikipedia defines CEP as follows:

 

Event processing is a method of tracking and analyzing (processing) streams of information (data) about things that happen (events), and deriving a conclusion from them. Complex event processing, or CEP, is event processing that combines data from multiple sources to infer events or patterns that suggest more complicated circumstances. The goal of complex event processing is to identify meaningful events (such as opportunities or threats) and respond to them as quickly as possible.

 

CEP relies on a number of techniques, including:

 

  • Event-pattern detection (a toy sketch follows this list)
  • Event abstraction
  • Modeling event hierarchies
  • Detecting relationships (such as causality, membership or timing) between events
  • Abstracting event-driven processes
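As a toy illustration of the first technique, here is a small, self-contained Java sketch of event-pattern detection: it watches a stream of login events and raises an alert when one user produces three failures within a sliding time window. All names are hypothetical:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;

    // Toy CEP pattern: alert when one user has 3 failed logins within 60 seconds.
    class FailedLoginDetector {
        private static final int THRESHOLD = 3;
        private static final long WINDOW_MS = 60_000;

        private final Map<String, Deque<Long>> failures = new HashMap<>();

        void onEvent(String user, boolean success, long timestampMs) {
            if (success) return;  // this pattern only tracks failures
            Deque<Long> times = failures.computeIfAbsent(user, u -> new ArrayDeque<>());
            times.addLast(timestampMs);
            // Evict events that have fallen out of the sliding window.
            while (!times.isEmpty() && timestampMs - times.peekFirst() > WINDOW_MS) {
                times.removeFirst();
            }
            if (times.size() >= THRESHOLD) {
                System.out.println("ALERT: repeated login failures for " + user);
                times.clear();  // reset after the alert fires
            }
        }
    }

Real CEP engines express such patterns declaratively (for example in Esper's EPL, mentioned below) rather than in hand-written code, and add ordering, correlation, and persistence guarantees.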

 

CEP is primarily focused on operational data and operational issues, but it is also connected to ex-post analysis of event data: it enables efficient archiving and querying of valuable event data for later analysis.

As Heinz Roth explains in his work [8], the knowledge gained from event-data analysis can help adapt and improve the underlying CEP application.

 

An Event Data Warehouse (EDWH) is a subject-oriented, integrated, time-variant, and non-volatile collection of event data in support of operational or management decision-making processes. The EDWH stores data items that originate from a continuous stream of events; it stores the following types of data items:

 

  • Data items representing the original events in event streams
  • Data items capturing relationship information for event sequences resulting from event correlations
  • Derived or calculated data items from events or event streams.
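As a rough illustration, these three kinds of data items could be modeled as follows; the record names are hypothetical, not any product's schema:

    import java.time.Instant;
    import java.util.List;

    // 1. An original event, as it appeared in the event stream.
    record RawEvent(String eventId, String type, Instant occurredAt, String payload) {}

    // 2. Relationship information produced by event correlation, e.g. the
    //    ordered sequence of events that together formed one transaction.
    record EventSequence(String correlationId, List<String> eventIds) {}

    // 3. A derived or calculated item, e.g. a per-window aggregate over a stream.
    record DerivedMetric(String name, Instant windowStart, Instant windowEnd, double value) {}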

 

An EDWH supports a query language for accessing all three types of data items. In order to process event data analytically ex-post, it has to be permanently archived somewhere. The common solution to this problem is the data warehouse (DWH) approach, where data is added periodically in batches using an extract, transform, and load (ETL) process.

 

An additional term for these kinds of systems is active data warehousing, described as follows:

An active data warehouse (ADW) is a data warehouse implementation that supports near-time or near-real-time decision making. It features event-driven actions triggered by a continuous stream of queries (generated by people or applications) against a broad, deep, granular set of enterprise data.

 

The following are several examples of such CEP systems:

 

  • StreamBase – the commercial version of the project called Aurora set up by Mike Stonebraker at MIT, in conjunction with researchers from Brandeis University and Brown University. StreamBase's web site says that its complex event processing (CEP) platform allows for the rapid building of systems that analyze and act on real-time streaming data for instantaneous decision-making, and combines a rapid application development environment, an ultra-low-latency high-throughput event server, and connectivity to real-time and historical data.
  • Sybase Aleri Event Stream Processor – the complex event processing platform for rapid development and deployment of business critical applications that analyze and act on high velocity and high volume streaming data – in real-time.
  • Esper - a component for complex event processing with an Event Processing Language (EPL) for dealing with high-frequency, time-based event data. Esper supports storing state to Apache Cassandra.
  • Yahoo! S4 (Simple Scalable Streaming System) - a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data.
  • Twitter Storm - a free and open-source distributed real-time computation system.
  • Apache Kafka - a messaging system that was originally developed at LinkedIn to serve as the foundation for LinkedIn's activity stream and operational data processing pipeline.
  • RuleCore CEP Server - a Complex Event Processing server for real-time detection of complex event patterns from live event streams; a system-level building block, providing reactivity in an event-driven architecture.
  • Splunk - software to search, monitor and analyze machine-generated data by applications, systems and IT infrastructure at scale via a web-style interface. Splunk captures, indexes and correlates real-time data in a searchable repository from which it can generate graphs, reports, alerts, dashboards and visualizations.

 

9      Column Family Databases

Wikipedia defines column family databases (CFDB) as follows:

 

A column family is a NoSQL object that contains columns of related data. It is a tuple (pair) that consists of a key-value pair, where the key is mapped to a value that is a set of columns. In analogy with relational databases, a column family is like a "table", with each key-value pair being a "row". A column family represents how the data is stored on disk: all the data in a single column family will be located in the same set of files. A column family can contain columns or super columns. Each column is a tuple (triplet) consisting of a column name, a value, and a timestamp. (In a relational database, this data would be grouped together within a table with other, non-related data.) A super column is a dictionary; it is a column that contains other columns (but not other super columns).
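In code terms, the column-family model is often described as a nested sorted map. Here is a minimal, hypothetical sketch in that spirit (not Cassandra's or HBase's actual API), including the per-column timestamp that makes last-write-wins and historical reads possible:

    import java.util.SortedMap;
    import java.util.TreeMap;

    // Column family as a nested sorted map: rowKey -> columnName -> (value, timestamp).
    class ColumnFamily {
        record Cell(byte[] value, long timestamp) {}

        private final SortedMap<String, SortedMap<String, Cell>> rows = new TreeMap<>();

        void put(String rowKey, String column, byte[] value, long timestamp) {
            rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
                .merge(column, new Cell(value, timestamp),
                       // Keep the cell with the newest timestamp (last-write-wins).
                       (oldCell, newCell) -> newCell.timestamp() >= oldCell.timestamp() ? newCell : oldCell);
        }

        Cell get(String rowKey, String column) {
            SortedMap<String, Cell> row = rows.get(rowKey);
            return row == null ? null : row.get(column);
        }
    }

Because rows and columns are sorted maps, range scans over row keys and over columns within a row are cheap - the property that column-family stores exploit on disk.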

 

Please don't confuse column-oriented MPP systems (like Vertica) with column-family databases. The following is adapted from Greg Linden's post about C-Store (Vertica's prototype):

 

A CFDB is also column-oriented and optimized for reads. It is designed for sparse table structures, it compresses data, it has relaxed consistency, and it is extremely fast. But there are some big differences:

 

  • CFDB is not designed to support arbitrary SQL; it is a very large, distributed map.
  • CFDB emphasizes massive data and high availability on very large clusters, including cross-data-center clusters.
  • CFDB is designed to support historical queries (e.g., get data as it looked at time X).
  • CFDB does not require explicit table definitions, and strings are its only data type.

 

The following are several examples of such CFDB systems:

 

  • Google BigTable - a compressed, high performance, and proprietary data storage system built on Google File System, Chubby Lock Service, SSTable and a few other Google technologies. It is not distributed outside Google, although Google offers access to it as part of its Google App Engine.
  • Hypertable - an open source database inspired by publications on the design of Google's BigTable. Hypertable runs on top of a distributed file system such as the Apache Hadoop DFS, GlusterFS, or the Kosmos File System (KFS). It is written almost entirely in C++.
  • Apache Cassandra - a NoSQL solution that was initially developed by Facebook and powered their Inbox Search feature until late 2010. Jeff Hammerbacher, who led the Facebook Data team at the time, described Cassandra as a BigTable data model running on an Amazon Dynamo-like infrastructure.
  • Apache HBase (Hadoop) - provides Bigtable-like capabilities on top of Hadoop Core.
  • Apache Accumulo - a sorted, distributed key/value store that is a robust, scalable, high-performance data storage and retrieval system. Apache Accumulo is based on Google's BigTable design and is built on top of Apache Hadoop, Zookeeper, and Thrift.
  • Cloudata - open source implementation of Google's BigTable.

 

10     In-Memory Data Grid

An In-Memory Data Grid (IMDG) decouples RAM resources from individual systems in the data center or across multiple locations, and aggregates those resources into a virtualized memory pool available to any computer in the cluster. An IMDG uses a key-value structure where values are objects. It uses a shared-nothing scalability architecture, where data is partitioned onto nodes connected into a seamless, expandable, resilient fabric capable of spanning process, machine, and geographical boundaries.
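As a toy sketch of the shared-nothing partitioning idea, the snippet below hashes each key to exactly one owning node of the fabric. All names are illustrative; real grids add replication, rebalancing, and failure handling:

    import java.util.List;

    // Route each key to exactly one owning node by hashing its key
    // (shared-nothing partitioning: no node shares its data with another).
    class PartitionedGrid {
        private final List<String> nodes;   // e.g. ["node-a", "node-b", "node-c"]

        PartitionedGrid(List<String> nodes) { this.nodes = nodes; }

        String ownerOf(String key) {
            int bucket = Math.floorMod(key.hashCode(), nodes.size());
            return nodes.get(bucket);
        }
    }

In practice, grids prefer consistent hashing over the simple modulo shown here, so that adding or removing a node moves only a small fraction of the keys.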

 

In many cases a user will be able to configure ACID support, similar to some key-value databases. To better understand the IMDG solution, let's take a look at the GemFire architecture diagram:

[Figure: GemFire architecture diagram]

 

 

This diagram shows how three different data centers across the globe can be connected into one coordinated virtual environment with shared data.

Let's review the main distinguishing features of an IMDG, using GemFire as the example:

  • Wide Area Data Distribution - GemFire's WAN gateway allows distributed systems to scale out in an unbounded and loosely-coupled fashion without loss of performance, reliability and data consistency.
  • Heterogeneous Data Sharing - C#, C++ and Java applications can share business objects with each other without going through a transformation layer such as SOAP or XML. A change to a business object in one language can trigger reliable notifications in applications written in the other supported languages.
  • Object Query Language (OQL) - a subset of SQL-92 with some object extensions, to query and register interest in data stored in GemFire.
  • Map-reduce framework – a programming model when application function can be executed on just one fabric node, executed in parallel on a subset of nodes or in parallel across all the nodes.
  • Continuous Availability - in addition to guaranteed consistent copies of data in memory across servers and nodes, applications can synchronously or asynchronously persist the data to disk on one or more nodes. GemFire's shared-nothing disk architecture ensures very high levels of data availability.
  • High Scalability - scalability is achieved through dynamic partitioning of data across many member nodes and spreading the data load across the servers. For 'hot' data, the system can be dynamically expanded to have more copies of the data. Application behavior can also be provisioned and routed to run in a distributed manner in proximity to the data it depends on.
  • Low and Predictable Latency - GemFire uses a highly optimized caching layer designed to minimize context switches among threads and processes.
  • Very High Throughput - GemFire uses concurrent main-memory data structures and a highly optimized distribution infrastructure, offering 10X or more read and write throughput, compared with traditional disk-based databases.
  • Co-located Transactions to Dramatically Boost Throughput - Multiple transactions can be executed simultaneously across several partitioned regions.
  • Improved Scale-out Capabilities - Subscription processing is now partitioned to enable access by many more subscribers with even lower latency than before. Clients communicate directly with each data-hosting server in a single hop, increasing access performance 2 to 3 times for thin clients.
  • Spring Integration and Simplified APIs for Greater Development Ease - Thanks to the Spring GemFire Integration project, developers will be able to easily build Spring applications that leverage GemFire distributed data management. In addition, GemFire APIs have been modified for ease of startup and use. The developer samples included with GemFire have been updated to reflect the new APIs.
  • Enhanced Parallel Disk Persistence - Our newly redesigned "shared nothing" parallel disk persistence model now provides persistence for any block of data: partitioned or replicated. This enables all your operational data to safely "live" in GemFire, greatly reducing costs by relegating the database to an archival store.
  • L2 Caching for Hibernate - With L2 caching, developers can implement GemFire’s enterprise-class data management features for their Spring Hibernate applications. Highly scalable and reliable GemFire L2 caching vastly increases Hibernate performance, reduces database bottlenecks, boosts developer productivity, and supports cloud-scale deployment.
  • HTTP Session Management for Tomcat and vFabric tc Server - GemFire lets you decouple session management from your JSP container. You can scale application server and HTTP session handling independently, leveraging GemFire’s ability to manage very large sessions with high performance and no session loss. GemFire HTTP Session Management is pre-configured and can launch automatically with tc Server. For Tomcat, the module is enabled via minor configuration modifications.

 

Below are several examples of IMDG systems:

 

  • VMware vFabric™ GemFire® - a distributed data management platform providing dynamic scalability, high performance, and database-like persistence. It blends advanced techniques like replication, partitioning, data-aware routing, and continuous querying.
  • GigaSpaces XAP Elastic Caching Edition - an in-memory data grid for fast data access, extreme performance, and scalability. XAP eliminates database bottlenecks and guarantees consistency, transactional security, reliability, and high availability of your data.
  • Infinispan - an extremely scalable, highly available data grid platform - 100% open source, and written in Java. The purpose of Infinispan is to expose a data structure that is highly concurrent, designed from the ground up to make the most of modern multi-processor/multi-core architectures while at the same time providing distributed cache capabilities. At its core, Infinispan exposes a Cache interface which extends java.util.Map. It is also optionally backed by a peer-to-peer network architecture to distribute state efficiently around a data grid.
  • Hazelcast - an open source clustering and highly scalable data distribution platform for Java.
  • Oracle Coherence - replicated and distributed (partitioned) data management and caching services on top of a reliable, highly scalable peer-to-peer clustering protocol.
  • Terracotta BigMemory - stores “big” amounts of data in machine memory for ultra-fast access.
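Most of these grids expose a Map-style API on top of the partitioned fabric. Below is a minimal sketch using Hazelcast-style calls; the exact API varies by product and version, so treat it as an approximation rather than a definitive example:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    import java.util.Map;

    public class GridExample {
        public static void main(String[] args) {
            // Start (or join) a cluster member; entries are partitioned across members.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // A distributed map: reads and writes go to the partition that owns the key.
            Map<String, String> sessions = hz.getMap("http-sessions");
            sessions.put("session-42", "user=alice");
            System.out.println(sessions.get("session-42"));

            hz.shutdown();
        }
    }

Running the same program on several machines on the same network would form a cluster, with the "http-sessions" entries spread across all members.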

Summary

This overview of Big Data problems and NewSQL/NoSQL solutions aims to help you navigate the data management world, which is evolving at an insane pace. I also hope it has helped remove some of the bias against non-standard solutions, and made clear the trend of moving solutions to SaaS and cloud environments.

Since we didn’t dive deep enough into any of the presented solutions, I recommend collecting real stories from different HP Software groups who are currently tackling big data problems. They may have solved the problems using one of the proposed solutions, or perhaps one that isn’t included in this blog.

 

References

 

[1] Wikipedia

[2] Urban Myths about SQL, Michael Stonebraker, 2011

[3] Clarifications on the CAP Theorem and Data-Related Errors, Michael Stonebraker, October 2010

[4] Towards Robust Distributed Systems, Eric A. Brewer, July 2000

[5] Map-Reduce

[6] List of NOSQL Databases

[7] Hadoop: What it is, how it works, and what it can do

[8] Event Data Warehousing for Complex Event Processing, Heinz Roth, May 2010
