Can ETL Be Agile?
Business intelligence projects benefit greatly from an agile development approach. Since BI closely aligns IT with business, an iterative delivery model ensures that business stakeholders are always involved in the design process and that a constant dialogue is maintained. The objectives and benefits of agile project management include:
Response to rapidly changing requirements
High degree of customer involvement
Quick results
Progress measurement
Team motivation
This approach has traditionally been applied to the development of the presentation, or “customer-facing,” layer of BI. But how does an agile project manager make “upstream” processes like data architecture and ETL part of the iterative deliverable? Much of the “data plumbing” remains invisible to the end user, even in a self-service BI environment. Iterative delivery of the presentation layer is often delayed while data architects complete their work.
Part of this stems from the perception that the BI architect, as opposed to the business, is the “customer” of the ETL developers. This chain of dependency goes back even to the data modelers and architects, who may perceive the ETL developers as their customers. This linear approach to delivery does not fit with the business-centric agile model.
Instead, project managers directing the data architecture elements of an agile BI project should involve the customers as much in the upstream process as in the presentation layer. Much of this requires showing the value of the data transformation requirements as they translate into the analytic source. Elements of this customer-centric approach can include:
Involving source-system SMEs to provide system-of-record validation and quality checking
Creating dictionaries of business metadata to assist in logical model design
Testing ETL output for business validity and relevance
Creating recurring business rules to align data transformations with business standards and practices
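As a rough illustration of the last two items, recurring business rules can be kept as small, named checks that business users review in plain language and that run against each ETL output. The following is a minimal Python sketch, not tied to any particular ETL tool; the rule names, column names (region, revenue), and sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class BusinessRule:
    """A reusable check whose wording is agreed with business users."""
    name: str
    description: str  # plain-language statement the business signs off on
    check: Callable[[Row], bool]


def validate(rows: list[Row], rules: list[BusinessRule]) -> dict[str, list[int]]:
    """Return, for each rule, the indexes of the rows that violate it."""
    failures: dict[str, list[int]] = {rule.name: [] for rule in rules}
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row):
                failures[rule.name].append(index)
    return failures


# Hypothetical rules; real ones would come from the business metadata dictionary.
RULES = [
    BusinessRule(
        name="non_negative_revenue",
        description="Revenue is never negative.",
        check=lambda row: row["revenue"] >= 0,
    ),
    BusinessRule(
        name="known_region",
        description="Region is one of the agreed sales regions.",
        check=lambda row: row["region"] in {"EMEA", "APAC", "AMER"},
    ),
]

# Hypothetical ETL output rows, standing in for a data mart extract.
SAMPLE_OUTPUT = [
    {"order_id": 1, "region": "EMEA", "revenue": 120.0},
    {"order_id": 2, "region": "MARS", "revenue": 75.5},
    {"order_id": 3, "region": "APAC", "revenue": -10.0},
]

report = validate(SAMPLE_OUTPUT, RULES)
for rule in RULES:
    print(f"{rule.name}: {len(report[rule.name])} failing row(s) ({rule.description})")
```

Because each rule carries its own plain-language description, the same list can serve as a review artifact for business users during the sprint and as an automated test for the ETL developers.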
Data architecture and ETL can then be included as part of the full BI development sprint with user-defined deliverables at the end (assume 4 to 6 weeks for each sprint):
Data model ⇒ ETL ⇒ Data mart development ⇒ Semantic Layer ⇒ Reports and Analytics
Business users are then involved in each stage of the sprint until it ends, at which point the next set of deliverables is identified and stories are developed for the next iteration. This ensures end-user involvement in the entire BI development lifecycle, not just the “customer-facing” product at the end of the chain.