Led by former LinkedIn and Uber hands, Mountain View, California-based StarTree looks to drive wider use of real-time analytical applications based around the Apache Pinot OLAP engine. This kind of technology has many uses in a world where great volumes of data arrive at ultrahigh velocity.
TECHNOLOGY EMPLOYED
If “OLAP” had marketing magic, that was a long time ago. OLAP was an early attempt to go beyond relational database and data warehouse limitations, but Apache Pinot is probably better described in today’s parlance as a column-oriented data store, and its competition can come from any of the many databases that have arisen in recent years. Apache Pinot is designed to handle fast ingestion of data and fast joins in users’ SQL queries. Since the StarTree focus is on cloud computing (it’s found in the three big cloud providers’ marketplaces), it can be, and has been, called a Database as a Service (DBaaS).
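To make “column-oriented” concrete, here is a minimal Python sketch of the general idea. It is an illustration only, not Pinot’s implementation or API: the event fields, the `to_columns` helper, and the `sum_fare_by_city` function are all hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical ride events; every field name here is illustrative only.
rows = [
    {"city": "SF", "fare": 12.5, "status": "done", "rider": "a1"},
    {"city": "NYC", "fare": 30.0, "status": "done", "rider": "b2"},
    {"city": "SF", "fare": 8.0, "status": "cancelled", "rider": "c3"},
    {"city": "SF", "fare": 20.0, "status": "done", "rider": "d4"},
]


def to_columns(rows):
    """Pivot row dictionaries into one list of values per column."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def sum_fare_by_city(columns):
    """Rough equivalent of:
    SELECT city, SUM(fare) FROM rides WHERE status = 'done' GROUP BY city
    Only the three columns the query names are read; 'rider' is never touched.
    """
    totals = defaultdict(float)
    for city, fare, status in zip(
        columns["city"], columns["fare"], columns["status"]
    ):
        if status == "done":
            totals[city] += fare
    return dict(totals)


columns = to_columns(rows)
totals = sum_fare_by_city(columns)
```

The point of the layout is that an aggregate query scans only the columns it names, which is one reason column stores tend to suit analytical workloads. Production engines add compression, indexes, and distribution on top of this basic idea.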
LEADERSHIP/BACKGROUND
StarTree is headed by its founders: CEO Kishore Gopalakrishna, who worked with Apache Pinot while a senior engineer at LinkedIn, and Xiang Fu, also once of LinkedIn and later part of the Pinot engineering effort at Uber, who now holds the title of Founding Engineer. Both men took part in the Apache Pinot open-source project.
Apache Pinot was largely driven by Big Data-era home-brewed efforts at LinkedIn and Uber, and in that respect it resembles a host of other Apache data processing tools. LinkedIn replaced a sharded MySQL system with an early Pinot version, which gave immediate answers to questions on thousands of business metrics across thousands of dimensions. At Uber, it’s been used to execute complex SQL calls on fast incoming streams of data related to Uber’s vast service fleet.
THE ECOSPHERE
As veterans of the real-time event processing experience know, the possible uses for systems such as Apache Pinot are extensive. So-called event stream processing has a long lineage. Tibco, for example, has been at this since the 1980s. But those were different times and technology. Clearly, this century’s upswing in open-source distributed processing frameworks has added new momentum to such software’s uses.
For its part, StarTree seeks to be the premier provider of commercial Pinot, much as Confluent (not incidentally, also part of the LinkedIn diaspora) has done with the Apache Kafka streaming platform.
Like others, it has formed an ecosystem of partnerships and alliances. Depending on your point of view, a mix of friends and frenemies can be discerned in a tiered StarTree ecosystem list that includes Confluent, Databricks, Datadog, Embeddable, InnoFrye, Kinetic Edge, Redpanda, Tableau (a Salesforce division), and others.
StarTree has worked with a variety of companies, including Citi, Stripe, DoorDash, and Dialpad, to further the cause of Apache Pinot. It looks to broaden its footprint further, since its StarTree Cloud offering provides data analytics as a service on AWS, Google Cloud, and Microsoft Azure platforms. That is important since, as with other advanced data engines, Pinot is challenging for typical organizations to stand up and run day-to-day, especially in comparison to long-established relational engines.
USE CASES
An analytical store like StarTree’s can be used for anomaly detection in security, performance metrics dashboards, recommender engines, real-time tracking of orders and deliveries, customer service, and observability applications that track the internal state and connections of today’s complex systems.
Last year, StarTree showed signs that it would direct some effort toward observability applications, as it debuted StarTree Cloud Observability, featuring query support for metrics, logs, and traces. Company representatives suggest the observability space is ripe for change, as some leading all-in-one observability offerings have gained a reputation as expensive. OpenTelemetry and other standards, it’s suggested, could mark an inflection point for disruption in the present observability status quo.
We were glad to speak recently with StarTree about the remarkable market segment they now ply. Our guests were Peter Corless, Director of Product Marketing for StarTree, and Chad Meley, Senior Vice President of Developer Relations and Marketing.
WHAT THEY SAY [EDITED FOR CLARITY AND BREVITY]
On the new forms of data analytics
Peter Corless: “We’re designed for real-time ingestion. These are distributed systems all co-operating together. These are clusters and clusters. It’s not all running in the same box.
“Yes, users can do these queries in their data warehouse, but it takes five minutes. A person trying to book an Uber doesn’t have five minutes to wait in the rain for their car. They want to know where that car is now. So, Uber added [H3] geospatial indexing on Apache Pinot to serve their purposes for real-time analytics.
“You’re talking petabyte workloads, and 10,000, or 100,000 QPS [Queries Per Second]. And when you’re talking about ‘at scale,’ you want to have a reliable platform. We’re generally not a Greenfield. We are usually a Brownfield. Usually, people have burned through two or three databases before they finally say ‘Okay, we’re at scale. We need to be running on Pinot.’”
On an observability market
Chad Meley: “If you look back at the emergence of the whole observability market, the big innovation that launched all these companies was based on the fact that they figured out agents. If you sprinkle [it] in the code, it starts to allow you to collect the data that’s instrumenting your application. What vendors then did was build whole ecosystems around it: to move it, to store it, to analyze it, to visualize it. The market already knew how to move data quickly, analyze it, and visualize it, but it was the agents that allowed this market to be built. With OpenTelemetry [and other open-source standards], a lot of companies now have open-source agents to instrument code and generate [observability] data. People no longer want to buy this monolithic thing. They want to build a stack based on best-of-breed layers.”
PROGRESSIVE GAUGE
When it comes to the observability market today, Apache Pinot provides a classic scenario where assemble-your-own, “best-of-breed” buffet design contrasts with “all-in-one,” “integrated suite” software from a single vendor. Observability could be ripe for disruption, and focus on the area could further StarTree’s advances. But streaming data was and still is hard, so the company will have to fire on all cylinders. The agenda for StarTree’s upcoming [May 14, 2025] Real-Time Analytics Summit 2025 provides a vivid window into what the company is set to accomplish. Check it out at: https://rtasummit.startree.ai/#Agenda
