As organisations work to make Big Data broadly available in the form of easily consumable analytics, they should consider outsourcing functions to the cloud. By opting for a Big Data as a Service solution that handles the resource-intensive and time-intensive operational aspects of Big Data technologies such as Hadoop, Spark and Hive, enterprises can focus more on the benefits of Big Data and less on the grunt work.
The advent of Big Data raises fundamental questions about how organisations can embrace its potential, bring its value to greater parts of the organisation and incorporate that data with pre-existing enterprise data stores, such as enterprise data warehouses (EDWs) and data marts.
The dominant Big Data technology in commercial use today is Apache Hadoop. It’s used alongside other technologies that are part of the greater Hadoop ecosystem, such as the Apache Spark in-memory processing engine, the Apache Hive data warehouse infrastructure, and the Apache HBase NoSQL storage system.
For enterprises to include Big Data in their core enterprise data architecture, adaptation of and investment in Big Data as a Service technologies are required. A modern data architecture suited to today’s demands should comprise the following components:
High-performance, analytic-ready data store on Hadoop. How can Big Data be speedy and analysis-ready? A best practice for building an analysis-friendly Big Data environment is to create an analytic data store that loads the most commonly used datasets from the Hadoop data lake and structures them into dimensional models. With an analytic-ready data store on top of Hadoop, organisations can get the fastest response to queries. These models are easy for business users to understand, and they facilitate the exploration of how business contexts change over time.
This analytic data store must not only support reporting for the known-use cases, but also exploratory analysis for unplanned scenarios. The process should be seamless to the user, eliminating the need to know whether to query the analytic data store or Hadoop directly.
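To make the dimensional-model idea concrete, here is a minimal sketch of an analytic-ready store: one fact table of additive measures joined to a descriptive dimension table (a simple star schema). It uses an in-memory SQLite database as a stand-in for a Hive-backed store, and all table, column and product names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the Hadoop/Hive analytic store;
# the star-schema structure is the point, not the engine.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension table: descriptive attributes business users filter and group on.
cur.execute("""
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    )
""")

# Fact table: additive measures, keyed to the dimension.
cur.execute("""
    CREATE TABLE fact_sales (
        sale_id     INTEGER PRIMARY KEY,
        product_key INTEGER REFERENCES dim_product(product_key),
        sale_date   TEXT,
        revenue     REAL
    )
""")

cur.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Widget", "Hardware"),
    (2, "Gadget", "Hardware"),
    (3, "Service Plan", "Services"),
])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (1, 1, "2024-01-05", 100.0),
    (2, 2, "2024-01-06", 250.0),
    (3, 3, "2024-01-06", 75.0),
])

# A typical analytic query: a measure rolled up by a dimension attribute.
cur.execute("""
    SELECT d.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.category
    ORDER BY d.category
""")
print(cur.fetchall())  # → [('Hardware', 350.0), ('Services', 75.0)]
```

Because the most commonly used datasets are pre-structured this way, the rollup above is a single indexed join rather than a scan of raw data lake files.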
Semantic layer that facilitates ‘business language’ data analysis. How can Big Data be accessible to more business users? To hide the complexities in raw data and to expose data to business users in easily understood business terms, a semantic overlay is required. This semantic layer is a logical representation of data, where business rules can be applied.
Previously, business users would have to query Hadoop directly, which is impractical, or request information from IT, which means waiting in a queue of reporting requests. A semantic layer enables business users to analyse and explore data using familiar business terms — without the need to wait for IT to prioritise requests. It also allows for the reuse of data, reports and analysis across different users, maintaining alignment and consistency and saving IT the effort of responding to every individual request on a case-by-case basis.
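At its simplest, a semantic layer is a central mapping from business terms to physical expressions, with business rules (such as how net revenue is calculated) defined once rather than in every report. The sketch below illustrates the idea in plain Python; the business terms, rules and underlying column names are all invented for illustration.

```python
# A toy semantic layer: business terms map to physical expressions, so
# users never see raw column names. All names here are hypothetical.
SEMANTIC_LAYER = {
    "Net Revenue":     "f.gross_amt - f.discount_amt",  # business rule defined once, centrally
    "Customer Region": "d.region_cd",
    "Order Month":     "substr(f.order_dt, 1, 7)",
}

def build_query(measure: str, group_by: str) -> str:
    """Translate a business-language request into SQL over the physical schema."""
    return (
        f"SELECT {SEMANTIC_LAYER[group_by]} AS group_key, "
        f"SUM({SEMANTIC_LAYER[measure]}) AS measure_value "
        f"FROM fact_orders f "
        f"JOIN dim_customer d ON d.cust_key = f.cust_key "
        f"GROUP BY {SEMANTIC_LAYER[group_by]}"
    )

# A user asks for "Net Revenue by Customer Region" in business terms;
# the layer produces the physical query, with the rule applied consistently.
print(build_query("Net Revenue", "Customer Region"))
```

Because every report resolves terms through the same mapping, changing a business rule in one place updates every downstream analysis consistently.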
A multi-tenant Big Data environment. How can Big Data be accessed throughout the organisation, no matter where people sit? With widespread demand for analytics, organisations need to embrace a hybrid centralised and decentralised approach to data. This allows different teams to incorporate local data sets and semantic definitions while also accessing the enterprise data resources that IT creates.
This hybrid approach can be achieved with a multi-tenant data architecture. In this architecture, IT collects and cleanses data into a shared Hadoop data lake and prepares a central semantic layer and analytic data store from that data.
IT then creates virtual copies of the centralised data environment for different business groups, such as finance, sales, marketing and customer support. This way, IT keeps the authority in data governance and semantic rules, while business groups and departments can truly see the impact of their daily business activities against historical or corporate data stored in Hadoop.
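One way to picture these “virtual copies” is as per-department views over a single shared store: the central data and its governance rules are defined once by IT, and each business group queries its own filtered slice without physically duplicating anything. The sketch below uses SQLite views as a stand-in for that mechanism; the department names, table and columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Central, IT-governed table standing in for the shared data lake.
cur.execute("CREATE TABLE central_metrics (dept TEXT, metric TEXT, value REAL)")
cur.executemany("INSERT INTO central_metrics VALUES (?, ?, ?)", [
    ("finance", "cost",     120.0),
    ("sales",   "bookings", 300.0),
    ("sales",   "pipeline", 900.0),
])

# Virtual copies: each department gets a view, not a physical duplicate,
# so data governance and semantic rules stay centralised with IT.
for dept in ("finance", "sales", "marketing", "support"):
    cur.execute(
        f"CREATE VIEW {dept}_metrics AS "
        f"SELECT metric, value FROM central_metrics WHERE dept = '{dept}'"
    )

# The sales team sees only its own slice of the central data.
cur.execute("SELECT metric, value FROM sales_metrics ORDER BY metric")
print(cur.fetchall())  # → [('bookings', 300.0), ('pipeline', 900.0)]
```

In a production multi-tenant environment the same separation would be enforced by the platform’s tenancy and access controls rather than hand-built views, but the principle is the same: one governed dataset, many scoped windows onto it.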
User-friendly ways of consuming analytics. How can the experience of Big Data analysis be user friendly? A final consideration for the end-user delivery of Big Data is the form in which data will be represented. These data interfaces should meet the unique and individual needs of all users. This requirement includes providing highly interactive and responsive dashboards for business users, intuitive visual discovery for analysts and pixel-perfect, scheduled reports for information consumers.
While each style is unique, the best practice is to ensure that these interfaces are not separate tools, so that creating, collaborating on and publishing information is done with consistency and accuracy. This is only achievable through a semantic layer that keeps data values consistent even as data presentation differs from one user interface to another.
Big Data is increasingly vital to the enterprise and a fundamental part of the enterprise data architecture. To tap its full potential, enterprises need to accelerate investments in technologies that efficiently and effectively store and analyse data. Cloud solutions for Big Data and analytics make that possible. With them, enterprises can position themselves well for future data growth, and in turn, excel in the ever-evolving Big Data ecosystem.