Blog

Sep
30
Easily Connect to Any API Source From Matillion ELT

If you are like many ETL developers you’ve struggled with an easy way to source cloud services data via REST API. Although standards are in place for REST API web services protocols, it seems that every vendor has their own variation of them, creating new challenges for each new source. Matillion’s cloud ELT product has long featured an API profile creator that sources from JSON files and creates RSD (Real Simple Discovery, an XML format) scripts for use with API query components. The effectiveness of this approach, however, is only as good as the quality of JSON files provided by the vendor.


Now, with version 1.47, Matillion introduces much more simplified functionality for extracting data from any REST API source. Called “Create Your Own Connector”, it does just that . . . creates a reusable connector profile with only a GET endpoint URL and any applicable authentication setting and/or parameters. No JSON conversion necessary! You can now establish a consistent “library” of extract profiles for all your API cloud data sources.


Here’s how it works: you create an extract “profile” by specifying a name and providing an endpoint URL:



temp-post-image

In this example, historical SpaceX data (which is publicly available) is the link being used. By hitting “Send”, the data immediately appears in the Response tab in its native format (usually JSON or XML). You can then specify any parameters in the Params tab (for example, you may want to limit the SpaceX output to only a specific ID or start/end date). If any authentication is needed, that can be set up in the Auth tab as either a UID/PW combo or an API Key.



Once the configuration is in place, you can then view the fields in a tree structure and specify any paging schemas if needed. And that’s it! You now have a working endpoint to your API source and can use that source in an API Extract component within a Matillion orchestration job.


temp-post-image


If you use Matillion with Snowflake this data can be staged to a table with the VARIANT data type, then transformed into structured data using the Extract Nested Data component. You can now integrate the structured version of your API data with any other structured data as part of a presentation data layer.


As more and more enterprise data sources move to the cloud, the need to easily and readily extract that data is growing. We are moving past the era of on-prem databases and traditional SQL queries as source elements for ETL. Matillion’s “Create Your Own Connector” addresses this trend by providing a simple and uniform way to source cloud data from virtually any vendor that utilizes REST API web services. Couple this with Snowflake as your target data platform and you have a complete end-to-end cloud solution for enterprise data management and analytics.


About the Author:


JOE CAPARULA is a Senior Consultant with Pandata Group who specializes in delivering data modeling and data integration services for clients across several industries.