Big data has been a fascinating topic. What began as a buzzword and quickly evolved into hype is now becoming something that CIOs must take very seriously.
Yes, it’s nice to think of a solution which allows you to extract more value out of your information. It’s certainly clear what the benefits are.
But what first appeared as an almost flowery technology, one that looks and sounds good but doesn’t quite match up to more serious CIO priorities, has now become just such a priority. This comes down to one quite annoying aspect: inevitability.
This isn’t something that can be put off. Corporate data is growing at an astronomical rate – 60 percent per year, if the figures are to be believed – and, as such, big data has become unavoidable. Rather than simply being something that provides benefits, it now interferes with not only traditional network architecture, but also enterprise IT culture.
“For most, managing and analysing this data presents both a challenge and an opportunity,” says Samer Ismair, MENA Systems Engineer, Brocade Communications. “As data requirements grow, the network must enable solution performance, not inhibit it. The ability to find value in large and fast-moving data continues to be a focal point for enterprises seeking competitive advantage.
“These advancements in data analysis are enabled by the underlying network. Organisations that can effectively mine their fast-growing stores of data stand to gain a competitive edge. However, the enormous volume, velocity, and variety of such data often make it difficult to extract timely insight and context-based results from all data sources.”
While big data technologies tend to be data centre friendly, they can stress a corporation’s network and firewall configurations because of the ongoing need for each node to confirm that it is still functioning with the other members of the cluster, according to Karthik Krishnamurthy, VP, Enterprise Information Management, Cognizant.
“Hadoop and NoSQL databases support the ability to segment operations across different dimensions,” Krishnamurthy says. “The range of database segmentation can include racks, networks or power circuits in a single data centre. Or on a broader scale, big data solutions can operate in multiple data centres and geographical regions with multi-data centre deployments.
“However, experience shows that big data solutions are generally confined to one location, which includes data feeds, storage and analytics. The analytics component of a big data solution might be shared or accessed across locations, but this generally involves smaller datasets, as the base data need not be shared. Data feeds represent the most critical network impact. Each source or feed needs to be analysed for expected volumes throughout the day.”
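The rack-level segmentation Krishnamurthy describes is, in Hadoop’s case, typically driven by an administrator-supplied topology script that maps node addresses to rack locations (configured via `net.topology.script.file.name`). The sketch below is illustrative only; the subnet-to-rack mapping is a made-up example, not a recommended layout.

```python
#!/usr/bin/env python3
# Minimal sketch of a Hadoop rack-awareness topology script.
# Hadoop calls the script configured in net.topology.script.file.name
# with one or more node addresses and expects one rack path per line.
# The subnet-to-rack mapping below is purely hypothetical.
import sys

RACKS = {
    "10.0.1": "/dc1/rack1",   # illustrative subnet for rack 1
    "10.0.2": "/dc1/rack2",   # illustrative subnet for rack 2
}
DEFAULT_RACK = "/default-rack"

def rack_for(addr: str) -> str:
    """Map an IPv4 address to a rack path by its /24 subnet."""
    subnet = ".".join(addr.split(".")[:3])
    return RACKS.get(subnet, DEFAULT_RACK)

if __name__ == "__main__":
    for addr in sys.argv[1:]:
        print(rack_for(addr))
```

With a mapping like this in place, HDFS can keep replicas on separate racks, which is what lets a cluster tolerate the loss of a whole rack, network segment or power circuit.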
Meera Kaul, Managing Director, Optimus, says the emphasis placed on a big data environment when deploying network architecture is actually a function of the industry vertical in which the business operates.
“Over the years, a lot of industry verticals have created for themselves petabytes of data that they have challenges with storing, processing and even making business sense out of,” Kaul says. “Increasingly, big data as a domain has moved away from just defining storage and processing of the data to increased monetisation of the information collected by analytical processing.
“The challenge for organisations in the next few years will be finding ways to better analyse, monetise, and capitalise on all these information channels and integrate them back into their business. This has led to increased importance of big data execution to network architectures for these business verticals as architectures move to define a strategy of doing more with less in a network converged environment.”
Big data deployment is more than just the implementation of a new application or software technology. It means an entirely new approach to system design; to management, access, use and administration policies; to network infrastructure design within data centres; and to the WAN links that tie mission-critical transactions to a content environment.
In terms of a corporate network and the network engineering team, there are best practices that should be employed to tool up for a big data implementation.
“As soon as a big data project starts, engage the development team and ask about the proposed architecture,” Krishnamurthy says. “Determine whether the end solution will fit within a single rack or is to be deployed across a single data centre. Ask if there is a need for geographic distribution of data, and whether that involves multiple data centres. Work with the development team to discover what the data replication requirements are for the production application.
“The main deliverable is a topology map for storage and feeds. Once more information about the solution has been discussed, review your network topology and ascertain the number and types of interconnects that will be utilised. This is important because when the application is in development, it is likely to run on a local network with more than adequate speed and capacity. In a multi-data centre production rollout, the interconnect speed and capacity between data centres may become critical to the overall success of the deployment.”
Whilst highlighting Apache Hadoop and NoSQL databases as the leading big data technologies in use today, Kaul adds that NoSQL databases are typically part of the real-time event detection process deployed on inbound channels. They can also be seen as an enabling technology behind analytical capabilities such as contextual search applications.
“These are only made possible by the malleable nature of the NoSQL model, where the dimensionality of a query evolves from the data in scope, rather than being fixed by the developer in advance. Running these requires a re-evaluation of compute, storage, and network infrastructure,” she says.
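The “malleable” model Kaul describes can be pictured with a few lines of code: documents in the same collection need not share a schema, and a query’s dimensions come from whatever fields the data happens to carry. The sketch below is generic Python rather than any particular NoSQL product, and the sample documents are invented.

```python
# Illustrative sketch of schema-flexible, document-style querying.
# Documents in one collection carry different fields; query criteria
# are supplied at call time rather than fixed by a schema in advance.
docs = [
    {"type": "tweet", "user": "a", "geo": "AE"},   # hypothetical records
    {"type": "sensor", "device": 7, "temp_c": 41},
    {"type": "tweet", "user": "b"},
]

def find(collection, **criteria):
    """Return documents matching every given field; a field absent
    from a document simply fails to match rather than raising."""
    return [d for d in collection
            if all(d.get(k) == v for k, v in criteria.items())]

print(find(docs, type="tweet", user="a"))
```

Contrast this with a relational table, where adding a new query dimension typically means altering the schema first; here the dimension (`temp_c`, `geo`, and so on) exists as soon as a document carries it.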
Whilst preparing a network for big data may sound like a daunting project, SAP MENA’s head of business analytics, database and technology, Jason Bath, questions the cost of not doing it.
“Data is one of the most valuable assets any business has, and the more data that can be accessed and used, in all its sources and varieties, the more assets can be leveraged in order to accelerate existing business processes, innovate new business models, and anticipate market demands,” Bath says.
“As big data becomes the norm, and we start calling it just ‘data’, the ability to access it and analyse it effectively becomes a minimum requirement to stay relevant and competitive.”
Furthermore, with traditional network architectures proving a bottleneck for big data, often making related applications inefficient, many see the upgrade as a necessary investment.
“Implementing big data is not only a matter of network architecture; it also requires a change in an enterprise’s IT culture,” says Steven Huang, Director of Solutions and Marketing, Huawei Enterprise.
“With big data applications now moving towards wholly integrated storage, CPU, software, and networking service packages for end users, IT staff will need to work together as a seamless team using the same set of technical tools.”
With the rise of big data, Huang adds, it will become more crucial that IT culture changes so that technical silos start to converge.
To add to that necessity, with the rate of information growth exceeding Moore’s Law, the average enterprise will need to manage 50 times more information by the year 2020, according to Peter Ford, Managing Director, Cisco Consulting Services.
“The requirements of traditional enterprise data models for application, database, and storage resources, which help make sense of the data flood, have grown over the years, and the cost and complexity of these models have increased along the way to meet the needs of big data,” Ford says.
“One size no longer fits all, and the traditional model is now being expanded to incorporate new building blocks that address the challenges of big data with new information processing frameworks purpose-built to meet big data’s requirements. However, these purpose-built systems also must meet the inherent requirement for integration into current business models, data strategies and network infrastructures.”
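The two growth figures quoted in this piece – 60 percent annual data growth and Cisco’s 50-fold increase – can be reconciled with a line of arithmetic. The sketch below is a sanity check only, not a forecast, and simply compounds the article’s own numbers.

```python
# Sanity check on the article's figures (arithmetic only, not a forecast):
# at 60 percent annual growth, how many years until data volume is
# fifty times today's?
import math

annual_growth = 0.60    # the article's "60 percent per year" figure
target_multiple = 50    # Cisco's "50 times more information" figure

years = math.log(target_multiple) / math.log(1 + annual_growth)
print(round(years, 1))
```

The answer comes out a little over eight years, which is broadly consistent with a 2020 horizon for the 50-fold figure.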
However, whilst many organisations are starting to look into big data, actual preparation of network architecture for it remains scarce.
Business leaders in the region need to radically rethink their approach to storage if they are to cope and derive true value, according to Huang.
“Businesses that need reactive real-time information and predictive analytics on human behaviour or traffic movement have been the first to start implementing networks to support big data,” he adds.
Another important area that demands further preparation in the Middle East is investment and awareness around data scientists.
“It is generally expected that data scientists are able to work with various elements of mathematics, statistics and computer science, although expertise in all of these subjects is not required,” Bath says.
“However, a data scientist is most likely to be an expert in only one or two of these disciplines, and proficient in another two or three. It is very rare to find an expert in all of these disciplines. This means that data science must be practised as a team, where across the membership of the team there is expertise and proficiency in all the disciplines.”