Two of the leading ETL/ELT tools for cloud data migration are Talend and Matillion, and both are well-positioned for moving and transforming data into the modern data warehouse. So if you’re moving to a cloud-hosted DW, whether it is a cloud-dedicated warehouse such as Snowflake or part of a larger cloud platform such as Amazon Redshift, Azure SQL Data Warehouse or Google BigQuery, which tool should you use to move your existing on-prem data?
Both Talend and Matillion can source a wide range of on-prem data and land it in a cloud-hosted data environment. They can also move data to and from Amazon S3 and Azure Blob storage, either of which can be used to stage data for a cloud-based DW. From a deployment perspective, the biggest difference between the two products is that Matillion is a pure cloud-hosted data integration tool, whereas Talend is a multi-faceted suite of data management products, some of which (such as data integration) are deployed locally as full-client applications. Matillion is deployed in your existing AWS or Azure environment (and is available in their respective marketplaces) as a stand-alone Linux VM or Amazon Machine Image.
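To make the staging pattern concrete, here is a minimal sketch of the kind of load statement that runs once files have landed in S3, using Amazon Redshift's `COPY` command as the example. It is an illustration under stated assumptions, not what either tool generates internally: the helper name `build_redshift_copy`, the table, the bucket path and the IAM role ARN are all hypothetical placeholders.

```python
def build_redshift_copy(table: str, s3_uri: str, iam_role_arn: str,
                        header_rows: int = 1) -> str:
    """Build a Redshift COPY statement that loads staged CSV files from S3.

    Hypothetical helper for illustration only; the table name, S3 path and
    IAM role passed in are assumed to exist already.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("staged files must be addressed with an s3:// URI")
    if header_rows < 0:
        raise ValueError("header_rows cannot be negative")

    # COPY reads every object under the S3 prefix and bulk-loads it into
    # the target table, so the warehouse does the heavy lifting.
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS CSV\n"
        f"IGNOREHEADER {header_rows};"
    )


if __name__ == "__main__":
    # All identifiers below are made-up placeholders.
    print(build_redshift_copy(
        table="analytics.orders",
        s3_uri="s3://example-staging-bucket/orders/",
        iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
    ))
```

The point of the sketch is the division of labor: the integration tool extracts on-prem data and writes it to cloud storage, and the warehouse then bulk-loads from that staging area. Other warehouses follow the same idea with their own load syntax.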
Talend’s core application is a design studio that comes in several flavors; in addition to Data Integration, Talend provides Data Quality and Master Data Management versions as well. This, coupled with cloud offerings such as Data Preparation (on-the-fly data cleanup), Data Stewardship and the recently unveiled Data Pipeline (a result of its acquisition of Stitch), makes Talend a better-rounded, albeit pricier, data management suite that is positioned for any type of ETL (on-prem, cloud or hybrid).
Matillion, on the other hand, focuses solely on data movement and ETL/ELT, and it is optimized specifically for Snowflake, Amazon Redshift, and Google BigQuery (with separate marketplace applications for each). Its pricing model is similar to most cloud-hosted solutions in that it is “pay-as-you-go”. A nice feature that distinguishes it from Talend is that Matillion jobs can be scheduled directly within the ETL interface, whereas Talend requires separate scheduling of its full-client data integration tool via a cloud-based administration console, as well as management of job server configurations and loads. Matillion is a “serverless” solution that is simple to implement and use. It can even be used by non-ETL architects (such as data analysts) to quickly and easily land cloud data for instant analysis.
In the end, determining which tool is right for your organization comes down to scope and user base. A mid- to large-size company with a large and varied data landscape may benefit from Talend’s wider footprint and data governance capabilities, particularly if there is an extensive user base of ETL architects and data stewards. On the other hand, agile organizations looking for a pure cloud play and simplicity of deployment and implementation should look to Matillion to quickly ramp up the population of their cloud data repositories. Both tools are strong players in the ever-evolving modern data warehouse space, and you can’t go wrong with either choice.