
NoSQL offers scalability and flexibility, users report

Users of NoSQL databases and data processing frameworks such as CouchDB and Hadoop are deploying these new technologies for their speed, scalability and flexibility, judging from a number of sessions at the NoSQL Now conference being held this week in San Jose, California.

“EMC is using a mixture of traditional databases and newfangled NoSQL data stores to analyse public perception of the company and its products,” explained Subramanian Kartik, a distinguished engineer at EMC.

“The process, called sentiment analysis, involves scanning hundreds of technology blogs, finding mentions of EMC and its products, and assessing if the references are positive or negative, using words in the text,” he said.

“To execute the analysis, EMC gathers the full text of all the blog and Web pages mentioning EMC and feeds them into a version of MapReduce running on its Greenplum data analysis platform. Hadoop is then used to weed out the Web markup code and non-essential words, which slims the data set considerably, and the resulting word lists are passed into SQL-based databases, where a more thorough quantitative analysis is done,” added Kartik.
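A minimal sketch of the kind of cleanup step Kartik describes is shown below, written in the style of a Hadoop Streaming mapper; the stop-word list, token handling and file layout are illustrative assumptions, not EMC's actual code.

```python
# Hypothetical Hadoop Streaming-style mapper: strips Web markup and common
# stop words from crawled blog text, emitting (word, 1) pairs so a reducer
# can total mentions before the slimmed data moves on to SQL tables.
import re
import sys

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it"}
TAG_RE = re.compile(r"<[^>]+>")           # crude removal of HTML/XML tags
TOKEN_RE = re.compile(r"[a-z0-9']+")

for line in sys.stdin:
    text = TAG_RE.sub(" ", line.lower())  # weed out Web markup
    for word in TOKEN_RE.findall(text):
        if word not in STOP_WORDS:        # weed out non-essential words
            print(f"{word}\t1")           # a reducer sums these counts
```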

“The NoSQL technologies are useful in summarising a huge data set, while SQL can then be used for a more detailed analysis,” Kartik said, adding that this hybrid approach can be applied to many other areas of analysis as well.

“There is all sorts of information out there, and at some point you will have to go through tokenising, parsing and natural language processing. The way to get to any meaningful quantitative measures of this data is to put it in an environment you know can manipulate it well, in a SQL environment,” Kartik said.
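The hand-off Kartik describes, from summarised word counts into a SQL environment where the quantitative work happens, might look something like the sketch below; the table layout and the positive and negative word lists are assumptions made for illustration.

```python
# Load the reduced word counts into a SQL table and run a simple
# sentiment-style tally there. Word lists and schema are illustrative.
import sqlite3

counts = [("reliable", 42), ("outage", 7), ("fast", 31)]  # from the MapReduce step

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mentions (word TEXT, n INTEGER)")
conn.executemany("INSERT INTO mentions VALUES (?, ?)", counts)

positive = ("reliable", "fast", "great")
negative = ("outage", "slow", "bug")
pos, neg = conn.execute(
    "SELECT SUM(CASE WHEN word IN (?,?,?) THEN n ELSE 0 END),"
    "       SUM(CASE WHEN word IN (?,?,?) THEN n ELSE 0 END) FROM mentions",
    positive + negative,
).fetchone()
print(f"positive mentions: {pos}, negative mentions: {neg}")
```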

For digital media company AOL, NoSQL products provide speed and volume that would not be possible using traditional relational databases.

“The company uses Hadoop and the CouchDB NoSQL database to run its ad targeting operations,” said Matt Ingenthron, manager of community relations for Couchbase.

AOL has developed a system that can pick out a set of targeted ads each time a user opens an AOL page, according to Ingenthron. “What ads are chosen can be based on the data that AOL has on the user, along with algorithmic guesses about what ads would be of most interest to that user. The process must be executed within about 40 milliseconds,” he said.
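The per-request lookup Ingenthron describes can be sketched roughly as follows: fetch a user's profile by key, score candidate ads against it, and stay within the roughly 40-millisecond budget. The profile store here is a plain dictionary standing in for a key-value database, and the scoring rule is invented for the example.

```python
# Illustrative ad-selection sketch under an assumed ~40 ms response budget.
import time

PROFILE_STORE = {"user:1001": {"interests": {"travel", "sports"}}}
CANDIDATE_ADS = [
    {"id": "ad-1", "topic": "travel", "bid": 0.8},
    {"id": "ad-2", "topic": "finance", "bid": 1.2},
]

def pick_ads(user_key: str, budget_ms: float = 40.0) -> list:
    start = time.monotonic()
    profile = PROFILE_STORE.get(user_key, {"interests": set()})
    # Rank ads by bid, boosted when the topic matches the user's interests.
    scored = sorted(
        CANDIDATE_ADS,
        key=lambda ad: ad["bid"] + (1.0 if ad["topic"] in profile["interests"] else 0.0),
        reverse=True,
    )
    elapsed_ms = (time.monotonic() - start) * 1000
    assert elapsed_ms < budget_ms, "over the response-time budget"
    return [ad["id"] for ad in scored[:2]]

print(pick_ads("user:1001"))
```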

Source data is voluminous. Logs are kept on all users’ actions on every server, AOL said. “They must be parsed and reassembled to build a profile of each user. The ad brokers also set a complex set of rules governing how much they will pay for an ad impression and which ads should be shown to which users,” Ingenthron said.

He added, “This activity generates 4 to 5 terabytes of data a day, and AOL has amassed 600 petabytes of operational data. The system maintains more than 650 billion keys, including one for every user, as well as keys for handling other aspects of the data. The system must react to 600,000 events every second.”

“Data feeds produce much of this source data, which come from Web server logs and outside sources. The Hadoop Flume component is used to ingest data. The Hadoop cluster also executes a series of MapReduce jobs to parse the raw data into summaries,” explained Ingenthron.
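A Hadoop Streaming-style reducer gives a rough idea of the summarising MapReduce jobs Ingenthron mentions; the "user_id<TAB>event" log format is an assumption for illustration, since AOL's actual jobs are not public.

```python
# Sketch of a summarising reducer: collapse raw, key-sorted log lines of the
# form "user_id<TAB>event" into per-user event counts. Hadoop delivers the
# mapper output to the reducer already grouped by key.
import sys
from itertools import groupby

def keyed_lines(stream):
    for line in stream:
        user_id, _, event = line.rstrip("\n").partition("\t")
        yield user_id, event

for user_id, events in groupby(keyed_lines(sys.stdin), key=lambda kv: kv[0]):
    summary = {}
    for _, event in events:
        summary[event] = summary.get(event, 0) + 1
    print(f"{user_id}\t{summary}")
```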

AOL also uses Couchbase’s CouchDB as a switching station of sorts for data arriving from the feeds, according to the company. “Because CouchDB can work with data without writing it to disk, it can be used to parse data quickly before sending it to the next step,” pointed out Ingenthron.
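As a rough illustration of that staging role, parsed records could be posted to CouchDB's standard HTTP document API as below; the host, database name and document shape are assumptions for the example, not details of AOL's deployment.

```python
# Stage each parsed record as a JSON document in CouchDB so the next
# processing step can pick it up. Assumes a local CouchDB with an existing
# database named "ad_feed_staging".
import requests

COUCH_URL = "http://localhost:5984/ad_feed_staging"

def stage_record(record: dict) -> str:
    resp = requests.post(COUCH_URL, json=record, timeout=5)
    resp.raise_for_status()
    return resp.json()["id"]   # CouchDB returns the new document's id

doc_id = stage_record({"user": "user:1001", "event": "page_view", "ts": 1314700000})
print("staged as", doc_id)
```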

“We didn’t anticipate ad targeting to be a primary [market] for us. But Couchbase ended up filling a need for AOL and other ad companies,” Ingenthron said. The work is “technically complex and has a lot of challenges in processing data very quickly.”

Scientific and medical publishing house Elsevier was looking for greater flexibility when it procured an XML-based, non-relational database system from Mark Logic, according to Bradley Allen, VP of Elsevier Labs.

“The scientific publishing world is moving from a static model to a more dynamic one,” Allen explained. For the past few centuries, the printed scientific paper, collected in journals, has served as the basic unit of knowledge, containing a description of the work, its authors and contributors, references and other core components of information. While the scientific publishing world is moving to digital, paper remains the dominant medium for data communication. “We’re still in the horse-and-carriage era,” Allen quipped.

“Over time, the scientific paper will be decomposed into individual elements, which can be used in multiple products. Individual paragraphs or even individual assertions can be annotated and indexed,” Allen predicted. “They can then be reassembled into new works and embedded in applications, such as programs that doctors can consult. They can also be mined for new information through the use of analytics.”

With this in mind, Elsevier is in the process of annotating the papers in its journals so they can be deployed in other applications and services. “An XML database was a natural fit for this work,” Allen explained. “New content types can easily be added into a database, and the format allows individual components to be easily reused in new composite applications and services,” he added.
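The kind of decomposition Allen describes can be illustrated with a small sketch: a paper stored as XML whose annotated assertions can be pulled out individually and reused elsewhere. The element and attribute names are invented for the example, not Elsevier's or Mark Logic's actual schema.

```python
# Pull individually annotated assertions out of an XML-encoded paper so they
# can be indexed or embedded in another product, independent of the article.
import xml.etree.ElementTree as ET

paper_xml = """
<article id="doi:10.9999/example">
  <title>An Example Study</title>
  <section name="results">
    <assertion id="a1" type="finding">Compound X reduced symptoms by 40%.</assertion>
    <assertion id="a2" type="caveat">The sample size was small.</assertion>
  </section>
</article>
"""

root = ET.fromstring(paper_xml)
for assertion in root.findall(".//assertion"):
    print(assertion.get("id"), "->", assertion.text)
```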

Elsevier has introduced a number of new products with this approach. One is SciVal, a service for academic administrators that summarises the publishing activity within their institution, giving them a quantitative view of the organisation’s academic strengths and weaknesses, he reported. Another is Science Direct, a full-text search engine for Elsevier’s journals.
