Experts detail common challenges that IT teams encounter when deploying and managing real-time data streaming platforms and offer advice on how to address them.
Technology and architecture can provide the right foundation for real-time data streaming success, but — as with all strategic IT initiatives — managing the required change in mindset is the ultimate determining factor between a promising idea and a successful result.
That’s according to Gordon Van Huizen, vice president of platform strategy at low-code development platform Mendix, who said changing organizational mindset is no easy feat. There’s often a tendency to think of data and event streaming as an augmentation of — or an add-on for — an existing paradigm. This thinking can lead to significant issues that will limit the applicability and long-term value of streaming initiatives and the platforms assembled to support them.
Real-time data streaming systems can also introduce a variety of new bottlenecks that create both technical and process limitations. Problems caused by the greater complexity of these systems, which can lead to failure when seemingly innocuous components or processes become slow or stall, are of particular note. Here are the top real-time data streaming roadblocks that organizations are facing today — and tips for overcoming them.
Reliance on centralized storage and compute clusters
Modern real-time data streaming technologies — such as Apache Kafka — are designed to support distributed processing and minimize coupling between producers and consumers. Binding their deployment too tightly to a centralized cluster — such as when deploying on a classic Hadoop stack — will stifle project and domain autonomy. This will, in turn, limit streaming adoption and data use.
“Instead, consider a distributed deployment architecture through containerization to enable greater flexibility and domain independence,” Van Huizen said.
Rigid processing pipelines
While the computing concepts behind event-driven architectures and data streaming are designed to promote loose coupling, stream processing stacks themselves can be too tightly bound, Van Huizen said.
Stream processing systems need to be open and flexible, allowing organizations to compose solutions out of heterogeneous processing services. IT teams should carefully consider how they design and implement processing pipelines. They should consider employing a pipeline abstraction framework, such as Apache Beam, to allow projects to use capabilities from a broader ecosystem of real-time data streaming technologies, according to Van Huizen.
Establishing a technical foundation for deployment and processing flexibility can also lead to management challenges.
“As we’ve learned the hard way from enterprise service bus and data lake initiatives, technology built for distributed and federated deployment is still often employed in a centralized and monolithic way by enterprises,” Van Huizen said.
Technologies that were meant to create autonomy and drive broad-scale use have created the largest enterprise IT monoliths to date, he added. Embracing domain-driven architecture is the key to ensuring that what is meant to be a common infrastructure doesn’t turn into a centralized monolith. This requires shifting from a push and ingest model common with extract, transform and load and event streams to serving a pull model across domains, Van Huizen said.
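One way to picture the pull model Van Huizen describes is as an append-only log that each consuming domain reads at its own pace. The sketch below is illustrative only: `DomainLog`, `publish` and `pull` are hypothetical names for an in-memory stand-in, not the API of Kafka or any other product.

```python
from dataclasses import dataclass, field


@dataclass
class DomainLog:
    """Append-only event log owned by one domain (hypothetical, in-memory)."""
    events: list = field(default_factory=list)
    offsets: dict = field(default_factory=dict)  # consumer name -> next index to read

    def publish(self, event):
        # The owning domain appends events; it knows nothing about its consumers.
        self.events.append(event)

    def pull(self, consumer, max_events=10):
        # Each consumer asks for events when it is ready and keeps its own position.
        start = self.offsets.get(consumer, 0)
        batch = self.events[start:start + max_events]
        self.offsets[consumer] = start + len(batch)
        return batch


orders = DomainLog()
for i in range(5):
    orders.publish({"order_id": i})

billing_batch = orders.pull("billing", max_events=2)    # billing reads slowly
shipping_batch = orders.pull("shipping", max_events=5)  # shipping reads everything
billing_next = orders.pull("billing", max_events=2)     # billing resumes where it stopped
```

Because each consumer tracks its own offset, a slow domain does not hold back a fast one, and the producing domain does not need to know who is reading.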
As data volumes grow, operations naturally become a bigger issue. For starters, backups take longer and consume considerable resources. Rebuilding indexes, defragmenting storage and reorganizing data are all time- and resource-consuming operations, according to George Radecke, solutions lead at IT consultancy Saggezza.
“If you’re running live 24/7, you need to plan for the extra resources to do all of the operations without failing to meet your service-level agreements,” Radecke said.
Sizing and scaling everything is a common real-time data streaming issue. Even big, experienced organizations undersize, Radecke added.
“When I get called in to solve performance issues, there are often design and implementation issues,” he said. “But equally often, the systems don’t have sufficient computing resources.”
Radecke recommended that IT teams document both the service levels they are expected to deliver and the service levels they require from others. He also encouraged IT teams to test in an environment that is sized to at least one-quarter of the production environment. If they drive one-quarter of the expected load against one-quarter of the data, they can identify and resolve any issues long before deploying to production.
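The quarter-scale arithmetic can be sketched as follows. This is a minimal illustration, not Radecke's method: the function name, the parameters and the production figures are all hypothetical, and rounding the node count up is an assumption made to respect the "at least one-quarter" guidance.

```python
import math


def scaled_test_plan(prod_events_per_sec, prod_data_gb, prod_nodes, scale=0.25):
    """Return test-environment targets at a fraction of production (hypothetical helper)."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    return {
        "events_per_sec": prod_events_per_sec * scale,
        "data_gb": prod_data_gb * scale,
        # Round the node count up so the test cluster is never below the target fraction.
        "nodes": math.ceil(prod_nodes * scale),
    }


# Made-up production figures, for illustration only.
plan = scaled_test_plan(prod_events_per_sec=40_000, prod_data_gb=2_000, prod_nodes=10)
```

Scaling load, data and capacity by the same factor keeps the ratio between them close to production, which is what makes bottlenecks found in the smaller environment meaningful.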
Controlling network experience
Controlling the network experience for a client can be challenging. This is an even bigger challenge in the cloud, said Rob Doucette, vice president of product management at Martello Technologies, a network performance management vendor.
An on-premises hardware component can help to optimize viewing and provide a better user experience should an issue arise. Fault and performance management software can monitor real-time data streaming application delivery and take immediate actions to improve performance. Good metrics to track include jitter, packet loss and latency. Events in the IT infrastructure typically cause these issues, according to Doucette.
“Knowing which problems are occurring concurrently can help you identify the source of the issue faster,” Doucette said.
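The three metrics Doucette mentions can be computed from per-packet samples, as in the rough sketch below. The sample format and function name are assumptions, and jitter is simplified here to the mean absolute change between consecutive latencies; RTP (RFC 3550) defines a smoothed estimator instead.

```python
from statistics import mean


def stream_metrics(samples, sent_count):
    """Summarize network quality for one stream.

    samples: list of (sequence_number, latency_ms) for packets that arrived.
    sent_count: number of packets the sender reports having sent.
    """
    latencies = [latency for _, latency in sorted(samples)]
    # Simplified jitter: average absolute difference between consecutive latencies.
    deltas = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    return {
        "latency_ms": mean(latencies) if latencies else None,
        "jitter_ms": mean(deltas) if deltas else 0.0,
        "packet_loss": (sent_count - len(samples)) / sent_count,
    }


# Hypothetical samples: packet 3 of 5 never arrived.
metrics = stream_metrics([(1, 20.0), (2, 24.0), (4, 22.0), (5, 30.0)], sent_count=5)
```

Computing all three over the same time window makes it easier to see which problems occur together.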
The deluge of new analytics use cases and data sources also poses a problem for enterprises. After a successful analytics project, business managers from across the enterprise may submit a wide variety of requests. But acquiring and preparing data with a data lake, on premises or in the cloud, can prove more challenging than expected because the scale of these activities has grown exponentially, said Buno Pati, CEO of Infoworks, a data operations and orchestration platform.
The issue is largely rooted in the use of legacy methodologies and tooling that require a persistently growing team of skilled data engineers and developers to ingest and synchronize data and create the analytics pipelines needed to deliver data to applications, Pati said.
Business integration hiccups
“One of the most common issues we see is that the enterprise as a whole is comprised of many lines of business and application teams, each focused inwardly on their mission and challenges,” said Jonathan Schabowsky, senior architect for the office of the CTO at Solace, an event messaging platform.
This works for a period of time, until each group needs to integrate and exchange real-time data streams of events across the enterprise. This can result in multiple integration points in an attempt to federate said events, according to Schabowsky.
Investing in a single platform
Apache Kafka is sometimes mistaken for an event or message broker when at its core it’s really a NoSQL database that persists events for as long as the business requires, Schabowsky said. Although this may seem like a minor nuance, it can be a critical detail, given that real-time movement and federation of events across the enterprise require very different tooling than the storage of an immutable event log.
Thus, IT teams should really understand where to use each platform to solve their unique use cases, and they must consider that event streaming across lines of business is where significant business value typically lies.
“Avoid buying into the hype that there is a single streaming platform that solves every use case you can conceive,” Schabowsky said.
Within the streaming paradigm, separate event movement and federation from long-term persistence and event sourcing use cases. This simplifies the problem of real-time event streaming and lets the organization score some quick wins and begin to recognize the ROI of event streaming sooner, Schabowsky said.
IT and OT disconnect
IT teams and operational technology (OT) teams typically operate in different worlds. But many types of real-time data streams, particularly those originating from IoT devices, have different characteristics than what IT teams may be familiar with. This data can run at different frequencies and produce a wide variety of data sets in just as many formats, including audio, video, heat and vibration, said Ramya Ravichandar, vice president of product management at FogHorn Systems, an edge computing platform.
To drive successful real-time data streaming deployments for these use cases, IT and OT teams must work together to connect the value of the business use case to the application. OT staff members have deep domain knowledge, which can range from knowing what sounds a failing machine makes to how various machine functions correlate to one another, according to Ravichandar.
“This expertise is the missing link to provide context and clarity to the vast amount of data points produced by machines and sensors,” Ravichandar said.
New development paradigm
Real-time data streaming involves not just a new infrastructure, but a new development paradigm as well, said Karthik Ramasamy, co-founder and CEO of Streamlio, a real-time processing platform. This not only affects infrastructure, but also data engineers and developers, making it more complex than just swapping out a layer of infrastructure.
“To reduce that challenge, it is important that organizations avoid choosing a large rip-and-replace project as their first real-time streaming use case,” Ramasamy said. Instead, he said to choose a smaller project so that existing infrastructure and programs do not need to be refactored and replaced. In addition, he suggested IT teams look for technologies that use simpler and more familiar interfaces and paradigms rather than new and unfamiliar ones.
Difficulty predicting processing time
“More often than not, the input data does not exactly meet our expectations, and it can be hard to predict how aggregating it will behave,” said Liran Haimovitch, co-founder and CTO of Rookout, a debugging tools vendor.
Unlike with batch processing, it’s not practical for IT teams to just rerun the job until they find a bug in the processing pipeline. Because the data is streaming in real time, any bugs in data processing will generate incorrect results.
The distributed and ephemeral state of streaming data processing platforms also hinders standard debugging techniques. Testing won’t catch every type of unexpected data, but new, more modern debugging and observability techniques and tools can help mitigate the issue.
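One common mitigation, which the article itself does not name, is to route records that fail processing to a side channel (often called a dead-letter queue) along with enough context to debug them later, rather than letting one bad record stall or corrupt the stream. The sketch below assumes a hypothetical record shape with `sensor` and `value` fields.

```python
def process_stream(records):
    """Aggregate valid records; capture failures with context instead of crashing."""
    totals = {}
    dead_letters = []
    for position, record in enumerate(records):
        try:
            sensor = record["sensor"]
            value = float(record["value"])
        except (KeyError, TypeError, ValueError) as exc:
            # Keep the raw record, its position and the error for later inspection.
            dead_letters.append(
                {"position": position, "record": record, "error": repr(exc)}
            )
            continue
        totals[sensor] = totals.get(sensor, 0.0) + value
    return totals, dead_letters


# Hypothetical input mixing valid and malformed records.
incoming = [
    {"sensor": "a", "value": "1.5"},
    {"sensor": "a", "value": "oops"},  # not a number
    {"sensor": "b"},                   # missing field
    {"sensor": "b", "value": 2},
    None,                              # not a record at all
]
totals, dead_letters = process_stream(incoming)
```

The captured entries give engineers concrete examples of the unexpected data to inspect, while the valid records continue to flow through the aggregation.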