Requirements for Big Data Real-Time Analytics


To build a successful real-time business analytics system, you’ll likely need an in-memory database plus several other core technologies. Here’s a look at each potential piece of a real-time analytics solution.

Adopt In-Memory Databases
In-memory databases hold tables and key information in main memory, rather than on disk as traditional database systems do. Working in main memory is much faster than writing to and reading from a file system, notes McObject. As a result, in-memory databases can perform an application’s data management functions an order of magnitude faster than traditional databases, McObject asserts.
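To get a rough feel for the difference, the sketch below runs the same small workload against an in-memory database and an on-disk one. It uses Python's built-in sqlite3 module only as a convenient stand-in, not as one of the products discussed in this article. The table name, row counts and the `load_and_sum` and `compare` helpers are illustrative assumptions, and the timing gap you see will depend heavily on your hardware and operating system caching.

```python
import os
import sqlite3
import tempfile
import time


def load_and_sum(conn, n_rows=2_000, batch=100):
    """Insert n_rows, committing every `batch` rows, then aggregate.

    Returns (sum_of_values, elapsed_seconds).
    """
    conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
    start = time.perf_counter()
    for i in range(n_rows):
        conn.execute("INSERT INTO readings (value) VALUES (?)", (float(i),))
        if (i + 1) % batch == 0:
            conn.commit()  # on disk, each commit has to reach the file system
    conn.commit()
    total = conn.execute("SELECT SUM(value) FROM readings").fetchone()[0]
    return total, time.perf_counter() - start


def compare(n_rows=2_000):
    """Run the same workload in memory and on disk; return both results."""
    mem = sqlite3.connect(":memory:")  # database lives entirely in RAM
    mem_result = load_and_sum(mem, n_rows)
    mem.close()

    with tempfile.TemporaryDirectory() as tmp:
        disk = sqlite3.connect(os.path.join(tmp, "readings.db"))
        disk_result = load_and_sum(disk, n_rows)
        disk.close()

    return {"memory": mem_result, "disk": disk_result}


if __name__ == "__main__":
    for name, (total, secs) in compare().items():
        print(f"{name}: sum={total} in {secs:.4f}s")
```

Both runs return the same sum; only the elapsed time differs. Treat the numbers as a toy illustration rather than a benchmark of any commercial in-memory database.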

Embrace DRAM
DRAM is volatile, so it should be protected against power failure by battery or capacitor backup. Such designs typically use high-speed flash to protect and then recover or restore the data, notes Wikibon.

Leverage Flash
All data that is not in DRAM should be held on high-performance flash technology. The reason: Any traditional disk or hybrid disk technology typically becomes a performance bottleneck. Historically, flash innovations demanded a price premium. But consumer demand for flash continues to drive down costs across the board. Plus, scale-out flash array architectures allow physical data to be shared across many applications without impacting performance.

Go Parallel
All processing should be highly parallelized – with high-bandwidth, low-latency interconnects between processors, memory and flash technologies (where necessary).

Think Bigger
All of the metadata that describes your data should be held in DRAM to maximize performance.
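One way to picture this is a store that keeps record payloads in a file (standing in for flash here) while the metadata, in this case each record's offset and length, stays in an in-memory dictionary. The following is a minimal sketch under that assumption; the `IndexedStore` class and its methods are hypothetical names for illustration, not part of any product mentioned in this article.

```python
import os
import tempfile


class IndexedStore:
    """Append-only record file with an in-memory index of key -> (offset, length)."""

    def __init__(self, path):
        self._path = path
        self._index = {}          # metadata lives in memory only
        open(path, "ab").close()  # make sure the data file exists

    def put(self, key, payload: bytes):
        """Append the payload to the file and remember where it landed."""
        with open(self._path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # current end of file
            f.write(payload)
        self._index[key] = (offset, len(payload))

    def get(self, key) -> bytes:
        """Look up the location in memory, then do a single targeted read."""
        offset, length = self._index[key]
        with open(self._path, "rb") as f:
            f.seek(offset)
            return f.read(length)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = IndexedStore(os.path.join(tmp, "data.bin"))
        store.put("order-1", b"hello")
        store.put("order-2", b"world!")
        print(store.get("order-2"))
```

Because the index is answered from memory, finding a record never requires scanning the slower storage tier; only the final read touches it. A real system would also need to persist or rebuild the index after a restart, which this sketch leaves out.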

Think Ahead
Make sure the system supports anticipatory fetching and processing, which enables faster access to supporting data from multiple data streams.
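Anticipatory fetching can be sketched as a small wrapper that, every time one item is requested, starts loading the items it expects to be asked for next. The example below is an illustration only: the `Prefetcher` class, the `fetch` callable and the `predict_next` callable are all hypothetical, and the sketch assumes a single calling thread and does not bound or expire the items it has fetched ahead of time.

```python
from concurrent.futures import ThreadPoolExecutor


class Prefetcher:
    """Fetch the requested key and start fetching predicted keys in the background."""

    def __init__(self, fetch, predict_next, workers=2):
        self._fetch = fetch                # callable: key -> value (slow)
        self._predict_next = predict_next  # callable: key -> iterable of likely next keys
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}                 # key -> Future for fetches already started

    def get(self, key):
        # Reuse a fetch that was started earlier, if there is one.
        future = self._pending.pop(key, None)
        if future is None:
            future = self._pool.submit(self._fetch, key)

        # Start loading whatever is likely to be requested next.
        for nxt in self._predict_next(key):
            if nxt not in self._pending:
                self._pending[nxt] = self._pool.submit(self._fetch, nxt)

        return future.result()

    def close(self):
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    def slow_fetch(page):
        return f"contents of page {page}"

    prefetcher = Prefetcher(slow_fetch, predict_next=lambda page: [page + 1])
    print(prefetcher.get(1))  # also starts loading page 2
    print(prefetcher.get(2))  # served from the fetch started above
    prefetcher.close()
```

The prediction rule here is simply "the next page". In practice the quality of that prediction decides whether prefetching helps or just wastes bandwidth.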

That Makes Sense
Logical sharing of single copies of data should be built into the system.
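As a small-scale analogy, the sketch below gives two consumers their own logical views over one physical buffer, using Python's built-in memoryview so that no bytes are copied. The `make_views` helper, the fixed record size and the "reports" and "alerts" consumers are assumptions made up for this illustration; real systems do this sharing at the storage or database layer.

```python
def make_views(buffer, record_size):
    """Return one memoryview per fixed-size record without copying any bytes."""
    whole = memoryview(buffer)
    return [whole[i:i + record_size] for i in range(0, len(buffer), record_size)]


# One physical copy of the data.
data = bytearray(b"AAAABBBBCCCC")

# Two consumers, each with its own logical view of the same bytes.
reports = make_views(data, 4)
alerts = make_views(data, 4)

# An update made through one view changes the single underlying copy...
reports[1][:] = b"ZZZZ"

# ...so the other consumer sees it immediately, with no second copy to sync.
if __name__ == "__main__":
    print(bytes(alerts[1]))
    print(bytes(data))
```

The point of the design is that there is nothing to keep in sync: every consumer reads the same bytes, so an update is visible everywhere at once.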

Know Your Options
Choosing an in-memory database requires plenty of research. Five potential options include Aerospike, IBM BLU, Microsoft, Oracle and SAP HANA.

A Closer Look: Aerospike
Aerospike is a flash-optimized, in-memory, open source NoSQL key-value database. It handles very high-volume streams of data. Enterprises with existing transaction applications would need to migrate them to Aerospike for real-time analytics integration.

A Closer Look: IBM BLU
IBM BLU is based on the standard IBM DB2 OLTP 10.5 offering. BLU Acceleration capabilities are designed mainly for “read-mostly” inline analytics. BLU can leverage SIMD (single instruction, multiple data) on IBM POWER7 or POWER8 chips to improve performance. APIs in DB2 can potentially ease migration from Oracle.

A Closer Look: Microsoft
Microsoft SQL Server 2014 now has an In-Memory OLTP extension. Integration with SQL Server means you can have both memory-optimized tables and disk-based tables in the same database, and query across both types of tables, Microsoft asserts.

A Closer Look: Oracle
Oracle’s database now offers an in-memory option (at additional cost) for performing analytic queries in parallel. It integrates with high-availability options such as Oracle RAC and Data Guard.

A Closer Look: SAP HANA
HANA stands for High-Performance Analytic Appliance. Generally speaking, SAP HANA’s greatest value is in providing specific operational reports within minutes – rather than days.

Source: Big Data: 14 Requirements for Real-Time Analytics