At Hitachi Data Systems (HDS), Hu Yoshida is the CTO responsible for defining the company’s technical direction. To the wider industry, he is regarded as one of the most influential thought leaders in storage today. Speaking to CNME, Yoshida sets the record straight on cloud and big data.
While the Middle East storage industry was booming in 2012, the global market appeared to suffer. How do you look back on the year for HDS?
Generally, 2012 was kind of a tough year for the storage industry – there was a lot of consolidation going on. For us, however, it was record quarters one after another. We’ve now had 12 record quarters in a row, and we closed out the last quarter with a 3 percent year-on-year increase. The Middle East tells another story, where we’ve seen tremendous, triple-digit growth. Globally, growth has been held back by the currency problems in Europe, and the US is still coming out of that deep recession we had back in 2009. The economies here are much more robust than those in developed regions right now, and I think the Middle East is looking to diversify so it is not so dependent on oil.
One of the biggest storage trends of the moment is cloud, but why aren’t enterprises really embracing the public cloud?
One of the biggest concerns is security, and availability is still in question. You still hear at times about cloud companies like Amazon and Google being offline for a while, and that has major consequences. The responsibility for a company still resides with the company – you can’t outsource responsibility. So maybe some of the infrastructure that is not as critical to the business can be outsourced to the cloud. There was also the thought that smaller, mid-range customers might outsource more because then they don’t have to set up their own infrastructure, but as we start to see this new trend toward the virtualisation of servers and more of the unified computing platforms, customers are finding that it’s easy to start using more of this IT equipment in house.
Big data appears to be even further behind cloud – not just in adoption, but awareness too. Why is that?
There are several aspects of big data that people are confused about. One is this explosion of data, and there are three parts to addressing it. Today, we have customers approaching exabytes, so volumes are going to be so large that you can’t back it all up, and you can’t extract, translate and load it into a data warehouse, so you’re going to have to bring the applications to that data. The second is velocity – how fast you can ingest the data, how fast you can search it, and then how fast you can provision the infrastructure for it. That’s why you’re seeing more people looking toward these unified computing platforms, where you roll in a rack that already has the servers, storage and network integrated, preconfigured and precertified. The third is variety – different types of data and how you see across them. A lot of the value will be in the intersections of different data, and right now data sits in silos. So those things are important, and that’s what we’re building the infrastructure for. What big data is really about is how you get value from all this variety of information, so it’s more about the analytics and deep expertise within the verticals.
Does this skyrocketing data warrant a new approach to an organisation’s enterprise storage, or is there a more affordable way to prepare for this?
Many people try to take the easy way. They say this means we need cheap commodity storage – dumb storage – and should do everything with software. But I have a different view, because when you have cheap, dumb things, you have to find the intelligence somewhere else, and that means it’s in the application or the server. That takes cycles, cycles take software, and software takes maintenance. So my feeling is to move more of that function down into the infrastructure and let the infrastructure take care of it automatically.
Do you find that CIOs have big budgets for these sorts of solutions and that this is high on their priorities?
Most of them right now don’t know what they need big data for. There’s a lot of confusion and misinformation around it. We believe you need to build the infrastructure first, and that depends heavily on virtualisation, so that changes in the infrastructure don’t impact the applications or the data. That’s why, at HDS, we are focused on virtualisation – not only of storage and the hypervisor, but also of the server platforms themselves. You need to buy the infrastructure first and then scale. If you buy cheap point solutions, you’re just setting yourself up for more problems in the future.
What one technology will have the biggest influence on the storage industry this year?
Flash storage is going to become more and more important. Up to now, flash has had three things wrong with it. One, it was very expensive compared to hard disk – normally 10 times the price – and an SSD had a limited capacity of only about 400GB. The second is durability, because they wore out very quickly. The third is that performance varied, because SSDs were built with a single-core processor – they were designed for PCs and servers, not the enterprise storage market – so vendors have not invested a lot in the controller. What we are doing is building our own controllers. By doing that, we can put multi-core processors in, do multi-threading, increase the capacity to terabytes rather than gigabytes, and also reduce the cost relative to hard disk. So I think that will be big this year.