
Dmitriy Setrakyan manages daily operations of GridGain Systems and brings over 12 years of experience spanning all areas of application software development, from design and architecture to team management and quality assurance. His experience includes architecture and leadership in the development of distributed middleware platforms, financial trading systems, CRM applications, and more. Dmitriy is a DZone MVB, is not an employee of DZone, and has posted 57 posts at DZone.

Distributed Caching in 5 Minutes

06.09.2014

If you prefer a video demo with coding examples, skip to the screencast at the bottom of this blog.

Distributed In-Memory Caching generally allows you to replicate or partition your data in memory across your cluster. Memory provides much faster access to the data, and utilizing multiple cluster nodes significantly increases the performance and scalability of the application.
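
To make the difference between partitioning and replication concrete, here is a minimal plain-Java sketch. It is not GridGain code: the class `PartitionSketch` and its methods are hypothetical names used only for illustration, each "node" is simulated by a local map, and keys are assigned by simple modulo hashing, whereas real data grids typically use more elaborate schemes that cope with nodes joining and leaving.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionSketch {
    /** Picks the index of the node that owns a key (simple modulo hashing, an assumption). */
    static int ownerOf(Object key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    /** Partitioned mode: every entry is stored on exactly one simulated node. */
    static List<Map<Integer, String>> partition(Map<Integer, String> data, int nodeCount) {
        List<Map<Integer, String>> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++)
            nodes.add(new HashMap<>());
        for (Map.Entry<Integer, String> e : data.entrySet())
            nodes.get(ownerOf(e.getKey(), nodeCount)).put(e.getKey(), e.getValue());
        return nodes;
    }

    /** Replicated mode: every simulated node holds a full copy of the data. */
    static List<Map<Integer, String>> replicate(Map<Integer, String> data, int nodeCount) {
        List<Map<Integer, String>> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++)
            nodes.add(new HashMap<>(data));
        return nodes;
    }

    public static void main(String[] args) {
        Map<Integer, String> data = new HashMap<>();
        for (int i = 0; i < 6; i++)
            data.put(i, "v" + i);
        System.out.println("partitioned: " + partition(data, 3));
        System.out.println("replicated:  " + replicate(data, 3));
    }
}
```

Partitioning lets total capacity grow with the number of nodes, while replication keeps every read local at the cost of storing each entry on every node.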


Most products that do distributed caching call themselves In-Memory Data Grids. Beyond simply providing hash-table-like access to your data, a data grid product should provide some combination of the following features:
  • Clustering
  • Distributed Messaging
  • Distributed Event Notifications
  • Distributed ACID Transactions
  • Distributed Locks
  • Distributed Data Queries, possibly using SQL
  • Distributed Data Structures, like Maps, Queues, Sets, etc.
  • Clustered Web Sessions
  • OR-Mapping Integration, including Hibernate
  • Persistent Database Support, like Oracle, MySQL, etc.
Of course, the devil is in the details. For example, given the distributed nature of a cluster, anything can fail at any point. So a good question to ask is how failures are handled, especially failures that happen during commit. If a failure during commit can leave the cluster in a semi-committed state, that is definitely a problem.
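
One common way to avoid a semi-committed state is a two-phase commit, where every node first stages the write and the change becomes visible only if all nodes voted yes. The sketch below is a simplified plain-Java illustration under that assumption. It does not describe how GridGain or any other product implements transactions, and `TwoPhaseCommitSketch`, `Node`, and `writeAll` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoPhaseCommitSketch {
    /** A hypothetical participant: one simulated cluster node. */
    static final class Node {
        final Map<Integer, String> committed = new HashMap<>();
        final Map<Integer, String> pending = new HashMap<>();
        final boolean failOnPrepare;

        Node(boolean failOnPrepare) {
            this.failOnPrepare = failOnPrepare;
        }

        /** Phase 1: stage the write; returning false is a vote to abort. */
        boolean prepare(int key, String value) {
            if (failOnPrepare)
                return false;
            pending.put(key, value);
            return true;
        }

        /** Phase 2 (success): make the staged writes visible. */
        void commit() {
            committed.putAll(pending);
            pending.clear();
        }

        /** Phase 2 (failure): discard the staged writes. */
        void rollback() {
            pending.clear();
        }
    }

    /** Writes one entry to all nodes: either every node commits or none does. */
    static boolean writeAll(List<Node> nodes, int key, String value) {
        List<Node> prepared = new ArrayList<>();
        for (Node n : nodes) {
            if (!n.prepare(key, value)) {
                // One node voted to abort: undo everything staged so far.
                for (Node p : prepared)
                    p.rollback();
                return false;
            }
            prepared.add(n);
        }
        for (Node n : nodes)
            n.commit();
        return true;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node(false));
        nodes.add(new Node(true)); // This node refuses to prepare.
        System.out.println("committed: " + writeAll(nodes, 1, "a"));
        System.out.println("node 0 data: " + nodes.get(0).committed);
    }
}
```

Note that this sketch only covers a failure in the prepare phase. A node crashing between prepare and commit is the harder case, and it is exactly the situation worth asking a vendor about.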

Another example is queries. Are predicate queries supported? Can you run SQL queries, and in particular, can SQL joins be handled? How are aggregate functions handled?
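
To show why aggregates deserve a close look, here is a small plain-Java sketch of a predicate query and an average computed over partitioned data. It is an illustration only, not a product API: `DistributedQuerySketch` is a hypothetical name and each partition is simulated by a local map. The point is that an average cannot be obtained by averaging per-node averages; each node has to return a partial sum and count, which are then combined.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class DistributedQuerySketch {
    /** Predicate query: each partition filters locally, then results are merged. */
    static List<Integer> query(List<Map<String, Integer>> partitions, Predicate<Integer> filter) {
        return partitions.stream()
            .flatMap(part -> part.values().stream().filter(filter))
            .sorted()
            .collect(Collectors.toList());
    }

    /** Average: combine per-partition (sum, count) pairs, not per-partition averages. */
    static double average(List<Map<String, Integer>> partitions) {
        long sum = 0;
        long count = 0;
        for (Map<String, Integer> part : partitions) {
            long partSum = 0; // Partial result a node would compute locally.
            for (int v : part.values())
                partSum += v;
            sum += partSum;
            count += part.size();
        }
        return count == 0 ? 0.0 : (double) sum / count;
    }

    public static void main(String[] args) {
        Map<String, Integer> node1 = new HashMap<>();
        node1.put("alice", 30);
        node1.put("bob", 40);
        Map<String, Integer> node2 = new HashMap<>();
        node2.put("carol", 50);

        List<Map<String, Integer>> partitions = new ArrayList<>();
        partitions.add(node1);
        partitions.add(node2);

        System.out.println("ages over 35: " + query(partitions, age -> age > 35));
        System.out.println("average age:  " + average(partitions));
    }
}
```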

Simplicity of the APIs is very important as well. The ConcurrentMap API has become a de facto standard for accessing data stored in distributed caches, but not all products support it. It is also worth checking whether other standard data structures are supported. For example, GridGain supports Map, Set, BlockingQueue, AtomicLong, AtomicSequence, and CountDownLatch, all in distributed fashion.
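
For readers less familiar with it, the ConcurrentMap contract can be tried out locally with the standard `java.util.concurrent.ConcurrentHashMap`. The sketch below runs in a single JVM and involves no cluster; `ConcurrentMapSketch` is a hypothetical class name. A distributed cache that supports this API is expected to offer the same atomic semantics across nodes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ConcurrentMapSketch {
    /** Runs the basic atomic operations and returns the resulting map. */
    static ConcurrentMap<Integer, String> demo() {
        ConcurrentMap<Integer, String> cache = new ConcurrentHashMap<>();

        // Plain put: returns the previous value, or null if there was none.
        cache.put(1, "1");

        // Put-if-absent: stores the value only when the key has no mapping.
        cache.putIfAbsent(2, "2");  // Stored, key 2 was absent.
        cache.putIfAbsent(2, "22"); // Ignored, key 2 already maps to "2".

        // Conditional replace: succeeds only if the current value matches.
        cache.replace(1, "1", "11");  // Succeeds, current value is "1".
        cache.replace(1, "1", "111"); // Fails, current value is now "11".

        return cache;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```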

Last but not least, always check performance. Load up the cluster and see what the throughput and latencies are, what the network load on each server is, and so on. A good benchmarking tool for testing distributed systems is the open source Yardstick Framework, available on GitHub.
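
As a rough idea of what "throughput and latencies" mean in practice, here is a minimal plain-Java measurement sketch. It is not Yardstick and does not use its API: `LatencySketch` is a hypothetical name, and a local `ConcurrentHashMap` stands in for the cache under test. A real benchmark would also need warm-up, multiple threads, and several client machines.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class LatencySketch {
    /** Nearest-rank percentile over an ascending-sorted array of latencies. */
    static long percentile(long[] sortedNanos, double pct) {
        int idx = (int) Math.ceil(pct / 100.0 * sortedNanos.length) - 1;
        return sortedNanos[Math.max(0, Math.min(idx, sortedNanos.length - 1))];
    }

    /** Times each put individually and returns the latencies sorted ascending. */
    static long[] measurePuts(ConcurrentMap<Integer, String> cache, int ops) {
        long[] latencies = new long[ops];
        for (int i = 0; i < ops; i++) {
            long start = System.nanoTime();
            cache.put(i, "v" + i);
            latencies[i] = System.nanoTime() - start;
        }
        Arrays.sort(latencies);
        return latencies;
    }

    public static void main(String[] args) {
        int ops = 100_000;
        ConcurrentMap<Integer, String> cache = new ConcurrentHashMap<>();

        long begin = System.nanoTime();
        long[] latencies = measurePuts(cache, ops);
        double seconds = (System.nanoTime() - begin) / 1e9;

        System.out.printf("throughput: %.0f ops/sec%n", ops / seconds);
        System.out.println("p50 latency (ns): " + percentile(latencies, 50));
        System.out.println("p99 latency (ns): " + percentile(latencies, 99));
    }
}
```

Looking at high percentiles rather than only the average matters here, because occasional slow operations are easy to miss in a mean.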

Coding Example

Here is a GridGain Data Grid coding example of some basic operations on distributed caches:

private static void atomicMapOperations() throws GridException {
    GridCache<Integer, String> cache = GridGain.grid().cache(CACHE_NAME);
 
    // Put and return previous value.
    String v = cache.put(1, "1");
    assert v == null;
 
    // Put and do not return previous value
    // (all methods ending with 'x' return boolean).
    // Performs better when previous value is not needed.
    cache.putx(2, "2");
 
    // Put asynchronously (every cache operation has async counterpart).
    GridFuture<String> fut = cache.putAsync(3, "3");
 
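    // Put-if-absent (the outcomes below assume key 4 was not in the cache yet).
    boolean b1 = cache.putxIfAbsent(4, "4");  // true: key 4 was absent, value stored.
    boolean b2 = cache.putxIfAbsent(4, "44"); // false: key 4 already holds "4".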
 
 
    // Put-with-predicate, will succeed if predicate evaluates to true.
    cache.putx(5, "5");
    cache.putx(5, "55", new GridPredicate<GridCacheEntry<Integer, String>>() {
        @Override public boolean apply(GridCacheEntry<Integer, String> e) {
            return "5".equals(e.peek()); // Update only if previous value is "5".
        }
    });
 
    // Transform - assign new value based on previous value.
    cache.putx(6, "6");
    cache.transform(6, new GridClosure<String, String>() {
        @Override public String apply(String v) {
            return v + "6"; // Set new value based on previous value.
        }
    });
 
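    // Replace (succeeds only if the current value equals the expected old value).
    cache.putx(7, "7");
    b1 = cache.replace(7, "7", "77");  // true: current value is "7".
    b2 = cache.replace(7, "7", "777"); // false: current value is now "77".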
}

Screencast

Here is a brief screencast showing how to get started with basic operations on your cluster in under 5 minutes:

Published at DZone with permission of Dmitriy Setrakyan, author and DZone MVB. (source)
