InfoQ

The Software Architects' Newsletter
March 2023

Welcome to the InfoQ Software Architects' Newsletter! Each month, we bring you essential news and experience from industry peers on emerging patterns and technologies.

This month, we focus on "Modern Data Processing: Data Pipelines, Streaming, and Data Mesh". These core topics currently span the entire "diffusion of innovation" graph in our 2022 AI, ML, and Data Engineering InfoQ Trends Report. We see increasing adoption of stream processing, distributed computation, and "data lake-as-a-service".

Key challenges remain in this space, including being conscious about how data pipeline architecture decisions are made at scale (and with the required speed) and bringing the social and ethical elements into the sociotechnical systems in which we all now work.

News

Vineyard v0.13.2: Zero-Copy In-Memory Sharing of Large Distributed Data

Vineyard v0.13.2, a zero-copy, in-memory data manager, has been released. This version includes improved features for Python/C++ development and Kubernetes deployment. Vineyard (v6d) is maintained as a CNCF sandbox project and provides distributed operators for sharing immutable data within or across cluster nodes. It is particularly interesting for deep network training (e.g., large language and graph models) on big (sharded) datasets. Its development is led by an Alibaba engineering team.

High-Performance Computing for Researchers and Students with Amazon Lightsail for Research

AWS recently announced the general availability (GA) of Amazon Lightsail for Research, a new offering designed to let researchers and students easily create and manage high-performance CPU or GPU research computers in the cloud.

Amazon Lightsail for Research is a solution for students and researchers who may not have access to dedicated computing resources or the technical expertise required to set up and manage a high-performance research environment. This new offering aims to provide a cost-effective way to leverage the power of the cloud for research and experimentation.

The Wonders of PostgreSQL Logical Decoding Messages

In a recent InfoQ article, Gunnar Morling, Senior Staff Software Engineer at Decodable, explores how PostgreSQL can emit messages into its write-ahead log (WAL) without updating actual tables. These logical decoding messages can be read using change data capture tools like Debezium, and stream processing tools like Apache Flink can then be used to enrich, transform, and route them.

There are several use cases for logical decoding messages, including providing audit metadata, application logging, and microservices data exchange. However, there is no fixed schema for logical decoding messages; the application developer must define, communicate, and evolve the schema.
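Because the schema is left to the application, one option is to wrap every message in a small, versioned envelope. The following is a minimal sketch under stated assumptions, not the approach from the article: the JSON envelope, its field names (`schema_version`, `type`, `payload`), the `audit` prefix, and the helper names are all hypothetical. It only builds a parameterized call to PostgreSQL's `pg_logical_emit_message(transactional, prefix, content)` function and does not connect to a database; the `%s` placeholders assume a psycopg-style driver.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

SCHEMA_VERSION = 1  # hypothetical: bump whenever the envelope layout changes


@dataclass(frozen=True)
class DecodingMessage:
    """Hypothetical envelope for the content of a logical decoding message."""
    type: str
    payload: Dict[str, Any] = field(default_factory=dict)
    schema_version: int = SCHEMA_VERSION

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable across runs.
        return json.dumps(
            {
                "schema_version": self.schema_version,
                "type": self.type,
                "payload": self.payload,
            },
            sort_keys=True,
        )


def build_emit_statement(
    prefix: str, message: DecodingMessage, transactional: bool = True
) -> Tuple[str, Tuple[bool, str, str]]:
    """Return the SQL text and parameters; executing them is left to the caller."""
    sql = "SELECT pg_logical_emit_message(%s, %s, %s)"
    return sql, (transactional, prefix, message.to_json())


def parse_message(content: str) -> DecodingMessage:
    """Consumer side: reject envelopes whose version this reader does not know."""
    data = json.loads(content)
    if data.get("schema_version") != SCHEMA_VERSION:
        raise ValueError(f"unsupported schema_version: {data.get('schema_version')!r}")
    return DecodingMessage(
        type=data["type"],
        payload=data["payload"],
        schema_version=data["schema_version"],
    )


if __name__ == "__main__":
    msg = DecodingMessage(type="audit", payload={"user": "alice"})
    sql, params = build_emit_statement("audit", msg)
    print(sql, params)
```

Carrying an explicit version in each message gives consumers a way to detect schema changes instead of failing on an unexpected shape, which is one way to handle the "define, communicate, and evolve" responsibility described above.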

Integrating Azure Database for MySQL Flexible Server with Power Platform and Logic Apps

Microsoft recently announced a new set of integrations between Azure Database for MySQL Flexible Server and both the Microsoft Power Platform and Azure, making it easier to develop solutions for analyzing data, automating processes, and building apps. These new integrations include Power BI, Logic Apps, Power Apps, and Power Automate.

Azure Database for MySQL Flexible Server is a deployment mode, generally available since November 2021, that provides more control and flexibility over database management functions and configuration settings than the Single Server mode. Users can use it as a managed service to run, manage, and scale highly available MySQL servers in the cloud. It supports MySQL versions 5.7 and 8.0.

 

Case Study

Design Pattern Proposal for Autoscaling Stateful Systems

As best practices for modern software engineering of distributed systems evolved around the principles of segregation and the ever-growing need for scalability, a common challenge arose: autoscaling stateful systems (i.e., databases) became complex and, at times, infeasible. That has led many companies to over-provision such systems so that they can cope with the highest expected demand.

This, of course, brings problems. Over-provisioning resources is costly, and it does not guarantee reliability: a sudden demand surge or a DoS attack can easily exceed the expected load. This article aims to dig deeper into the challenges faced when attempting to autoscale stateful systems and proposes an opinionated design solution to address many of those challenges through a mix of existing and novel approaches.

Autoscaling stateless systems is a well-understood field at this point. Yet autoscaling stateful systems has far fewer common standards and practices, especially in the public domain.

Synchronizing data on new nodes is a big challenge when scaling up a stateful system. It takes time and resources from the cluster, and the complexity only increases as the size of the data increases.

Consensus algorithms like Raft are commonly used in stateful systems to select a new leader when the previous one crashes, but they are rarely smart enough to determine which node would deliver the best performance to the cluster.

There are differences between scaling reads and scaling writes in a stateful system that prevent the adoption of a single strategy for both; adopting a specific strategy for each scenario improves the chances of successful automation.

Understanding when to scale the system up or down is vital, as there are substantial differences between scaling up a system whose performance is slowly degrading and defending a system against a sudden, temporary surge in demand, such as a DoS attack.
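The distinction in the last point, between slow degradation and a sudden, temporary surge, can be illustrated with a small sketch. This is not the design proposed in the article: the function name, the three labels, and every threshold below are hypothetical values chosen only for illustration. It compares the average of the most recent load samples with the average of the earlier ones.

```python
from statistics import mean
from typing import Sequence


def classify_load(
    samples: Sequence[float],
    capacity: float,
    window: int = 3,               # hypothetical: number of recent samples to inspect
    spike_ratio: float = 2.0,      # hypothetical: recent/baseline ratio that counts as a surge
    high_utilization: float = 0.8, # hypothetical: fraction of capacity that counts as "hot"
) -> str:
    """Return "spike", "sustained", or "normal" for a series of requests/sec samples."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if len(samples) <= window:
        raise ValueError("need more samples than the recent window")

    recent = mean(samples[-window:])
    baseline = mean(samples[:-window])

    # A sharp jump relative to the earlier baseline suggests a sudden surge.
    if baseline > 0 and recent / baseline >= spike_ratio:
        return "spike"
    # No sharp jump, but load close to capacity suggests gradual growth.
    if recent / capacity >= high_utilization:
        return "sustained"
    return "normal"


if __name__ == "__main__":
    print(classify_load([100, 100, 100, 100, 500, 520, 510], capacity=1000))
    print(classify_load([700, 720, 740, 760, 800, 820, 840], capacity=1000))
```

An autoscaler could then choose a different response for each label, for example a protective measure for a surge and added capacity for sustained growth; which responses are appropriate depends on the system.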

This content is an excerpt from a recent InfoQ article by Rogerio Robetti, "Design Pattern Proposal for Autoscaling Stateful Systems".

To get notifications when InfoQ publishes content on these topics, follow "AI, ML, and Data Engineering", "Streaming", and "Data mesh" on InfoQ.

Missed a newsletter? You can find all of the previous issues on InfoQ.

Sponsored

D2iQ

While microservices avoid many of the limitations of monolithic applications, they come with their own set of challenges, which include managing multiple services on multiple nodes, effectively scaling your application, and zero-downtime deployment of new versions of your services. Kubernetes was developed as a solution to these challenges. In this technical guide, learn 10 best practices you can use to effectively manage your microservices on Kubernetes.

Learn more about this topic in the technical guide "The 10 Best Practices for Microservices on Kubernetes", sponsored by D2iQ

Upcoming events

QCon: For practitioners, by practitioners


QCon: Early bird prices end April 19. Book today to save.

At QCon, you'll find many ways your team can connect over new ideas, learn from peers, and inspire how to solve your challenges.
Attend in-person or join with our on-demand pass.

Book your team tickets before April 19 and save with limited early bird tickets.

QCon New York - June 13-15.

QCon San Francisco - October 2-6.

We're hiring! InfoQ Editor-in-chief (full-time, remote)

Introduce the world's top software teams to early adopter technologies.


InfoQ seeks a full-time Editor-in-Chief to join C4Media's international, always remote team. Join us to cover the most innovative technologies of our time, collaborate with the world's brightest software practitioners, and help more than 1.6 million dev teams adopt new technologies and practices that push the boundaries of what software and teams can deliver!

If you reside in Europe, Africa, or South America in time zones between GMT-3 and GMT+2 and are passionate about creating technical content that will make a difference to the world's senior software teams, we’d love to hear from you!

Apply for the InfoQ Editor-in-Chief role

Senior software developers rely on the InfoQ community to keep ahead of the adoption curve. One of the main reasons software architects and engineers tell us they keep coming back to InfoQ is that they trust the information provided and selected by their peers.

We’ve been helping software development teams adopt new technologies and practices for over 15 years through InfoQ articles, news items, podcasts, tech talks, trends reports, and QCon software development conferences.
