The Grouparoo Blog
The 0.7 release of Grouparoo is a huge step forward for data engineers using Grouparoo to reliably sync a variety of types of data to operational tools.
Here are the key features of the release.
- Models enable Grouparoo to work with multiple data schemas at once.
- Grouparoo helps troubleshoot messy data and is more resistant to data problems
- New Destination: Braze Users
- DevOps Logging Plugins: AWS CloudWatch, Prometheus
Models
The primary addition is the concept of having multiple Models. Sources with Properties are added to a Model to define the shape of the Records that get created and synced to Destinations.
Here is how it works.
Previously, Grouparoo had the notion of "profiles" that were meant to represent people. Each profile mapped to a person in the system. This made sense because the most common kind of data that needed to be synced was that of customers, users, leads, contacts, and more. As developers have been using this setup, we have received feedback on how to iterate on the design.
First, we heard that not all "profiles" were the same. Doctors are different than patients. Leads are different than active users of the product. Therefore, the Record Properties of these different use cases should be different and be synced to different Destinations. This is now possible in the 0.7 release. You can create multiple Models that each have their own set of Sources, Properties, Groups, and Destinations.
We have also had many requests to sync Records that do not match the "profile" concept at all. We've heard about companies, accounts, events, locations, products, and a wide variety of custom schemas. In this release, you can create Models to represent anything and now we just have to create Destination plugins for the new data types. In future releases of Grouparoo, we will be enhancing our plugins to sync over these additional types of models.
A common use case that comes up is wanting to sync all of the Records to a Destination. Previously, one would have create a Group to enable this use case. Now with Models, we can be more sure that all of the Records are in the same dataset and give this option.
This change to Models is a huge step forward that will enable Grouparoo to be much more effective in our user's infrastructures. It is a big change, so be sure to check out our upgrade guide for how to get on board.
Invalid Data
Data quality is an ongoing battle in data engineering. In this release, Grouparoo continues to evolve to meet the challenge.
Grouparoo has always tried to enforce some consistency in the data by defining types for the Properties. For example, lifetime value (LTV) should be a "float" (number) value. The 0.7 release is more resilient to errors in the data as it being read from the Source. Previously it would cause an error during the import process, but now it will read in the invalid data and mark it as invalid.
Another interesting case is uniqueness. In Grouparoo, you can mark some Properties as unique. This is common for customer email address, for example. Grouparoo will now also flag these Records as invalid if the Source data does not follow the rules.
With these Records marked and the new option to browse them, Grouparoo can help troubleshoot data quality issues. When they are fixed upstream, Grouparoo will automatically read the changes and unflag the Records.
Plugins
Braze is a marketing automation tool for email and mobile communications. Grouparoo nows can sync its data to your Braze User Profiles to enable a better experience.
AWS CloudWatch enables viewing of log files in your AWS environment. The new Grouparoo plugin makes its logs available there.
Prometheus is a popular open source monitoring solution. Using this plugin, you can surface data from the running Grouparoo instances in that ecosystem.
Tagged in Product Engineering
See all of Brian Leonard's posts.
Brian is the CEO and co-founder of Grouparoo, an open source data framework that easily connects your data to business tools. Brian is a leader and technologist who enjoys hanging out with his family, traveling, learning new things, and building software that makes people's lives easier.
Learn more about Brian @ https://www.linkedin.com/in/brianl429
Get Started with Grouparoo
Start syncing your data with Grouparoo Cloud
Start Free TrialOr download and try our open source Community edition.