February 2020

We spoke recently with open-source time-series database maker InfluxData Inc. The occasion was the formal launch of InfluxDB Cloud, a database as a service, on Google Cloud. AWS support came first, and Microsoft Azure support is next up.
Launches like this are closely watched, as the cloud database has become a proving ground for the future of the open-source database. Highly visible wrangling between MongoDB and AWS, in particular, has thrown the issue into sharp relief.
The questions arise:

Will powerful cloud providers exploit open-source loopholes to co-opt small startups’ database innovations?

Are the innovators, in the words of the old Sonny Boy Williamson song, simply “fattening frogs for snakes?”

Let’s put blues philosophy aside for the moment. Let’s look at the record.

For its part, Google has put increasing effort into closing the wide gap between itself and cloud leaders AWS and Microsoft. This mercantile motive has led to altruism that sees Google taking an interest in effectively partnering with open-source database wunderkinds.
This was clear last year, when Google seized on AWS’s wranglings (principally with Elastic and MongoDB) and very visibly announced strategic partnerships with select open-source database makers, including InfluxData, MongoDB, Neo4j, Redis Labs and others. That is some of the background behind the February 4 rollout of InfluxDB Cloud on Google Cloud.

For InfluxData CEO Evan Kaplan there is no question that the future of the database, including the InfluxDB time-series database, is in the cloud.
“The market has voted on open source databases,” he said. “We all recognize that the next generation of database applications are going to be founded on open source, and they will be offered on the cloud.”
Clearly, Kaplan and other open-source database company leaders walk the delicate but familiar line of “co-opetition”: working with big cloud providers to reach an audience while also striving to best those big players in ease of use, features and performance.
“We have to be able to compete with our technology on the cloud,” he told us.

Databases on the cloud are now somewhere in a multiyear evolution that is far from complete. But, with more and more data accruing to the cloud and the complexity of managing massive infrastructure growing, it makes sense when Kaplan says that the future of databases is in the cloud.
That could be especially true for time-series data and analytics in the Industrial Internet of Things (IIoT), which Kaplan counts among the sweet spots for InfluxData’s technology. Time-series analytics have special currency in IIoT because trends in data points over time disclose important information on how systems and their environments are changing.
While IIoT is perhaps a fancy term, it does stand for something: the next generation of industrial systems that adjust their activity in real time based on immediate analysis of data points.
Today, industry looks at data through a new lens, with the purpose of improving operations. Some of these industries have been doing time-series data analysis for quite a while. The speed of processing, the amount of data and the ease of programming are all elements that make this a promising technology area, and these are features vendors will focus on going forward.
As with the move from the mainframe to client-server architectures in the 1990s, this could be a telling moment for industrial systems. That is because the increasingly massive scale of processing on the cloud is accompanied by a shift to serverless architecture.
Whether you select a big cloud provider or a smaller database player, the new architecture ushers in serverless computing. Database systems run as instances on demand for customers, like utilities (which, after all, was one of the first bits of nomenclature that rose up to describe what later became known as ‘the cloud’).
“InfluxDB Cloud is not tied to a server,” explains Kaplan. “It’s a serverless time-series platform built to collect, store and query processes, and to visualize rapidly ingested raw, high-precision time-stamped data.” That performance is the secret sauce InfluxData hopes will keep it a few steps ahead of the influential cloud players that dominate areas of computing today.

Time-series specialists such as Graphite and Timescale, among others, are in that hunt as well, while database stalwarts Oracle and IBM continue to tune their top offerings to bring time-series tech to a broader group of their customers. – Jack Vaughan
