The following summarizes how I rescued a data warehouse project that was on track for a disastrous failure. The Purpose? To provide valuable guidance to assist Project Managers on existing and future Data Warehouse projects, and to ensure that the same mistakes don’t happen twice.
Those who cannot remember the past are condemned to repeat it. - George Santayana
Lessons learned
This assessment summarizes the lessons learned from joining a project already millions over-budget and years late. The identified insights from this project have been segregated into three categories:
A. Immediate actions to rectify the project
B. General project management lessons learned
C. Data Warehouse specific lessons
Immediate actions to rectify the project
- High level view
Single page dashboard of sub-projects with their status and dates.
Single page dependency plan created to highlight the critical path and cross-organizational dependencies.
A single composite project plan was constructed for the first time.
- Coherent Communication
A communications plan was created and published that defined roles, responsibilities, escalation and communication process.
- Active project management and Coordination
Daily meetings (internal and with vendor).
Monthly in-person meeting with vendor.
- Phased approach
An achievable first phase that provided maximum value within an aggressive timeframe was scoped, and universally agreed by Executive Management, Marketing and Business leaders.
- Tracking
Migration from a flimsy unstructured “HotList” to a formal Issue Log.
- Risk Management
A comprehensive risk assessment and mitigation plan was created and published to look ahead, plan for and manage contingencies.
- Vendor Project Plan
The vendor was required to redo their project plan to:
  - Correct weaknesses
  - Archive completed phases/tasks
  - Map precisely to the new Program Dashboard
- Specs shared
Amazingly, vendor specs were hidden from the client.
Specs were acquired from the vendor, enabling review, feedback, and scheduled fixes.
- Synchronizing plans
A single consolidated project plan did not exist. A process was put in place to frequently synchronize vendor plans for accuracy and to manage critical path and dependencies.
- Milestone and deliverable management
Compelled vendor to add all short term project deliverables to the Issue Log for single location tracking.
- Resources
A Business Analyst (BA) was brought on board and dedicated for the duration of the project.
PM resource and role changes.
- Vision document
A vision document was created to define the goals of the remainder of the project in a phased approach.
- Vendor relationship management
Face-to-face visits with vendor were increased in both locations, and frank discussions uncovered and addressed a range of inefficiencies, concerns and confusion.
General Project Management
IT involvement
Belated IT involvement in vendor selection, management requirements and specifications resulted in a selection without IT’s capability assessment, poor coordination, and lack of project support within IT.
QA (including requirements and quality standards, as well as the IT QA role) was not defined. Unlike almost all other ongoing IT projects, there was no QA involvement throughout the project. A cross-functional team of IT and business QA should have been put in place early in the project to guide the quality requirements, define the test criteria, and complete testing.
Phased project approach
This project was conceived, planned and managed as a single scheduled deliverable to the business. No phases, no interim deliverables. A “phased” approach is useful because:
- A phased approach allows early delivery of highest value components to the business
- It allows team members to break the project down into more manageable segments.
- Each phase can be brought to some sense of closure as the next phase begins.
- Phases can be made to result in discrete products or accomplishments to provide the starting point for the next phase.
- Phase transitions are ideal times to update planning baselines, to conduct high level management reviews, and to evaluate project costs and prospects
- Each phase can be a “gate” for evaluation for engaging in the subsequent phase, based on actual costs, delivered capabilities and realized value
- A Proof of Concept (POC) is an even more cautious approach to demonstrating vendor, technology and architecture viability.
Scope definition
- Project scope was inadequately defined
- Weak and incomplete requirements
- Criteria for success were not defined
- Moving target syndrome (also known as “Scope Creep”)
Quality and thoroughness of testing are examples of evolving targets. Specification review was done in the final months of the project, leading to further delays as the resulting development fixes had to be scheduled.
Estimation
- The project was not realistically estimated with input from all involved parties.
- There was no robust estimation of the full project cost.
Roles and responsibilities
Roles and responsibilities for vendor and client were not clearly defined, creating crossed wires, confusion and tension. Examples of fundamental roles not clearly established until the end of the project include:
- Precisely who assesses reports, decides and approves data quality.
- Who makes decisions across the range of areas. The lack of clarity led to vendor frustration.
Custom work done instead of purchasing an available vendor product
A significant amount of custom coding was done to avoid the cost of product licenses. While licenses can be expensive, custom development carries huge risk and delays, as this project experienced.
Availability
- Design vs. Requirements
Data freshness and system availability were not clearly defined as requirements, and final design specifications were not checked against business needs.
- SLA
An overall IT Service Level Agreement, and sub-agreements for each vendor component, did not exist.
Vendor Management
- Vendor management was insufficient.
- Vendor selection was not done through a formal RFI/RFP process with multiple vendor responses and assessment, capability, proposal and pricing analysis. Effectively, a vendor was chosen on business user whim, without consideration of capabilities, or how the vendor would be managed.
- The Vendor was allowed to manage to their own SOW/Internals, and not to the project deliverables
- The vendor contract did not outline deliverables clearly
- The vendor contract specified a long term lock-in, constraining the ability of the client to manage the project and vendor.
- Structure of coordination, design, operational management and escalation responsibilities between vendors was not clearly thought through
- No single point of internal contact for vendors resulted in delayed and ultimately wasted vendor effort and frustration in seeking guidance and clarification.
- Vendor was permitted to withhold specifications from the client, preventing client IT from identifying and correcting mistakes before coding.
- Vendor was not shown the larger enterprise view of the overarching project
- Prioritization and Resource Planning
Vendor did not publish dedicated resource assignments that would let the PM and business guide sequence and priority.
Resource Management
a. No visibility into resource constraints and utilization
b. Client resources were committed to multiple concurrent projects without a means for scheduling or prioritizing efforts, or managing and escalating conflicting resourcing needs.
c. IT Team members had dramatically different perceptions of project priority.
d. The following IT resources should have been dedicated to this project:
• Data Architect
• Business Analyst
• Project Manager
Unstructured and chaotic Communication
a. All communications were done via email
b. All status, questions and updates were sent to all staff and executives
c. There was no formal structure or method to communication or escalation
d. There was no single point of contact within the vendor for communication, coordination and escalation.
Effectively, everyone at both client and vendor was spammed with a stream of details, drowning out useful communication and information.
Decision Making
- Decisions were made without IT involvement
- IT concerns and recommendations were overridden without sufficient consideration
Architecture
Much like Requirements creation, Architecture should fundamentally remain within the client organization with any outsourcing focused on analytics, development and service. This would ensure projects conform to long term architectural needs, and would retain the intellectual capital in-house.
Data Architecture
No overarching Data Architecture was defined up-front to encompass the full set of applications, data sources, staging areas and repositories. A Data Architecture describes the data entities as well as how the data is processed, stored, and utilized. Instead, seven data architectures, including those of canned applications, were integrated ad hoc.
Project Planning:
a. Multiple separate and partial project plans were used, rather than a single project plan, which prevented critical path management, adequate resource scheduling, and delivery date visibility.
b. Project plan did not provide a roll-up of detail into summary tasks and milestones
c. Resources were over-allocated without review or revision
d. No critical path and dependency analysis and optimization
e. No clear definition of deliverables
f. The schedule was not created based on the project plan, and hence lacked realism and lost credibility over time
g. Detailed Project plan not aligned with business-oriented deliverables
h. Vendor project plan was not fully integrated into the client project plan
i. Vendor was not required to structure the project plan to fit business objectives
j. Task/project dependencies were not clear
k. The project was inadequately defined, with remaining requirements left to be defined later.
l. Inconsistent project phases, tasks, level of detail (who, what, how).
m. High level mission and objectives not defined and communicated
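The missing critical path and dependency analysis (items a, d and j above) needs very little tooling to get started. The following is a minimal sketch, assuming Python 3.9+ for the standard `graphlib` module. The task names, durations and dependencies are hypothetical and are not taken from this project.

```python
from graphlib import TopologicalSorter

def critical_path(durations, depends_on):
    """Return (longest dependency chain, total duration in days)."""
    finish = {}    # earliest finish day for each task
    previous = {}  # the predecessor that determines each task's start
    # static_order() yields every task after all of its predecessors
    for task in TopologicalSorter(depends_on).static_order():
        preds = depends_on.get(task, ())
        start = max((finish[p] for p in preds), default=0)
        finish[task] = start + durations[task]
        previous[task] = max(preds, key=lambda p: finish[p], default=None)
    # Walk backwards from the task that finishes last
    task = max(finish, key=finish.get)
    path = []
    while task is not None:
        path.append(task)
        task = previous[task]
    return path[::-1], max(finish.values())

# Hypothetical tasks (durations in days) and their prerequisites
durations = {"data model": 10, "ETL build": 15, "reports": 5, "testing": 7}
depends_on = {
    "ETL build": ["data model"],
    "reports": ["data model"],
    "testing": ["ETL build", "reports"],
}
path, total_days = critical_path(durations, depends_on)
```

Here the chain through "ETL build" determines the end date, so a slip in "reports" of up to ten days would not move the schedule, while any slip in "ETL build" would.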
Project Management:
- Passive project management resulted in schedule drift
- Active project management techniques were not utilized
- Status meetings
i. Meetings were weekly, instead of daily
ii. Vendor was allowed to set the agenda and manage the meetings
iii. Meetings were informal, with no agenda, minutes or follow-up
- Tracking and managing of assignments and issues was informal
- Existing culture did not value, commit to or manage to tasks/dates
- Costs were not calculated or communicated
- Schedule updates were not frequently calculated and communicated
- The project schedule was not communicated, made visible, or widely believed
- Dates repeatedly allowed to slip
- Enhancement list growth impacted the schedule through scope creep, and was not managed against the original mission or separated into planned phases.
- Inadequate Vendor Effort Oversight
Vendor managed the plan, yet the majority of effort was largely unmanaged at the client site
Data Warehouse Specific
SLA
An SLA covering data freshness, performance, availability, catch-up and load times should have been defined.
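Once a freshness target is agreed, it can be checked automatically after every load. A minimal sketch, assuming a hypothetical 24-hour maximum data age and that the completion time of the last successful load is recorded somewhere:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical SLA value: warehouse data may be at most 24 hours old
MAX_DATA_AGE = timedelta(hours=24)

def is_fresh(last_load_completed, now, max_age=MAX_DATA_AGE):
    """True if the most recent successful load is within the agreed age."""
    return now - last_load_completed <= max_age

now = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
# Load finished 7 hours ago: within the SLA
fresh_after_nightly_load = is_fresh(
    datetime(2024, 1, 2, 2, 0, tzinfo=timezone.utc), now)
# Load finished 34 hours ago (a nightly run was missed): SLA breached
fresh_after_missed_load = is_fresh(
    datetime(2023, 12, 31, 23, 0, tzinfo=timezone.utc), now)
```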
Set of data elements
The minimum set of needed Data Elements should have been carefully selected in advance, instead of loading all data into the data warehouse. Loading the full set slowed analysis and design as well as the initial and daily extract, transmit and load times.
Tools
The initial lack of ETL tools resulted in inefficient use of an external vendor for extracts. Tools may be expensive, but manual (“handraulic”) effort is more expensive.
Data descriptions and definitions
The in-depth meaning of data elements should have been defined, documented, and shared to confirm with vendor that the meaning of the elements is clearly defined and understood.
Location of data cleansing
Where data is cleansed should have been defined; whether it is at the point of extract, load or in the Data Warehouse itself.
Level of data cleansing
The required level of data cleansing should have been defined in advance.
Resource
A Data Architect was not assigned or available, and the Project Manager was part-time. For a DW project, having a Data Architect and a data model early on is key to success.
Environments
Only a single DW environment was planned for. This prevented concurrent testing and production. It also prevented concurrent use of the production environment while loads to a separate environment were conducted.
Data model
The overall data model was not documented in advance. Data analysis into definitions, valid values, transformation, exception handling and testing was revisited after the project was largely completed.
Data history
The design should provide the ability to understand how data was populated into the data mart, when the source data was delivered, and whether the data was loaded manually or manipulated outside of the normal loading process.
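One common way to keep this history is to stamp every loaded row with lineage columns. The sketch below is illustrative only: the `LoadLineage` class, its field names and the file name are assumptions, not part of the project described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class LoadLineage:
    """Hypothetical lineage record shared by every row in one load."""
    batch_id: int
    source_file: str
    source_delivered_at: datetime
    load_method: str  # e.g. "scheduled" or "manual"

def stamp(rows, lineage):
    """Attach lineage columns to every row before it is written."""
    return [
        {
            **row,
            "batch_id": lineage.batch_id,
            "source_file": lineage.source_file,
            "source_delivered_at": lineage.source_delivered_at.isoformat(),
            "load_method": lineage.load_method,
        }
        for row in rows
    ]

lineage = LoadLineage(
    batch_id=42,
    source_file="orders_20240102.csv",  # hypothetical file name
    source_delivered_at=datetime(2024, 1, 2, 1, 30, tzinfo=timezone.utc),
    load_method="manual",
)
stamped = stamp([{"order_id": 1, "amount": 10.0}], lineage)
```

With these columns in place, a question such as "which rows were loaded manually?" becomes a simple filter on `load_method`.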
Specifications
Vendor did not share specifications for review
Auditing
Insufficient auditing of the loads provided little early insight into problems. These were discovered late, which drove up costs through rework.
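A basic load audit compares what the source delivered with what actually landed in the warehouse, and records the result for every batch. A minimal sketch using Python's built-in `sqlite3` as a stand-in database; the `fact_orders` and `load_audit` tables and their columns are hypothetical:

```python
import sqlite3

def audit_load(conn, batch_id, source_row_count):
    """Compare the source count with rows actually present for the batch."""
    loaded = conn.execute(
        "SELECT COUNT(*) FROM fact_orders WHERE batch_id = ?", (batch_id,)
    ).fetchone()[0]
    status = "OK" if loaded == source_row_count else "MISMATCH"
    conn.execute(
        "INSERT INTO load_audit (batch_id, source_rows, loaded_rows, status) "
        "VALUES (?, ?, ?, ?)",
        (batch_id, source_row_count, loaded, status),
    )
    conn.commit()
    return status

# Hypothetical tables standing in for the warehouse
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_orders (order_id INTEGER, batch_id INTEGER)")
conn.execute(
    "CREATE TABLE load_audit "
    "(batch_id INTEGER, source_rows INTEGER, loaded_rows INTEGER, status TEXT)"
)
conn.executemany("INSERT INTO fact_orders VALUES (?, ?)", [(1, 7), (2, 7)])

status_complete = audit_load(conn, batch_id=7, source_row_count=2)
status_short = audit_load(conn, batch_id=7, source_row_count=3)
audit_rows = conn.execute("SELECT COUNT(*) FROM load_audit").fetchone()[0]
```

The count is taken from the rows actually stored rather than from a counter kept by the load code, so a mismatch surfaces on the day of the load instead of weeks later.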
Design best practices:
Traceability
Inability to trace data back to source
Counters in code should be avoided
Counters in code, rather than actual records, prevent accurate reconciliation and tracing of data in the data warehouse.
Transaction management
No ability to reconcile partial data writes with roll-forward or roll-backward for transactional integrity.
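The simplest form of transactional integrity is to load each batch inside a single database transaction, so that a failure part-way through leaves nothing behind. A minimal sketch using Python's built-in `sqlite3`; the `fact_sales` table is hypothetical, and a production warehouse would use its own database's transaction facilities:

```python
import sqlite3

def load_batch(conn, rows):
    """Insert a batch atomically: either every row is stored or none is."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception is raised inside the block.
        with conn:
            conn.executemany(
                "INSERT INTO fact_sales (id, amount) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.Error:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_sales (id INTEGER PRIMARY KEY, amount REAL NOT NULL)"
)

first_ok = load_batch(conn, [(1, 10.0), (2, 20.0)])
# The second row repeats key 3, so the whole batch is rolled back,
# including the first row that had already been written.
second_ok = load_batch(conn, [(3, 30.0), (3, 40.0)])
row_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
```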
Limited consistency checks
Consistency checks on data being loaded into the DW were limited. Robust and comprehensive consistency checks are recommended.
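As an illustration of what such checks can look like, the sketch below validates one incoming row for a required field, a reference to known master data, and a value range. The field names and rules are hypothetical; real checks would come from the agreed data definitions.

```python
def check_row(row, known_customer_ids):
    """Return a list of problems; an empty list means the row is consistent."""
    problems = []
    # Required field
    if row.get("order_id") is None:
        problems.append("missing order_id")
    # Referential check against known master data
    if row.get("customer_id") not in known_customer_ids:
        problems.append("unknown customer_id")
    # Type and range check
    amount = row.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("invalid amount")
    return problems

known = {100, 101}  # hypothetical customer keys
good = check_row({"order_id": 1, "customer_id": 100, "amount": 25.0}, known)
bad = check_row({"order_id": None, "customer_id": 999, "amount": -5}, known)
```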
Insufficient test scripts
Daily load test was determined to be entirely inadequate for business objectives.
Primary keys not generated from sequence generators
Primary key created by incrementing the previous maximum value by one. This is known to be error-prone and likely to generate duplicate keys where uniqueness is required.
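The failure mode is easy to reproduce. In the sketch below, two loaders both read the current maximum before either writes, and so compute the same "next" key. The `SequenceGenerator` class is only an in-process stand-in for a database sequence, written for illustration; in practice the database's own sequence or identity feature would be used.

```python
import itertools
import threading

def max_plus_one(existing_keys):
    """The error-prone approach: derive the next key from the current maximum."""
    return max(existing_keys, default=0) + 1

class SequenceGenerator:
    """Illustrative stand-in for a database sequence: never repeats a value."""
    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_value(self):
        with self._lock:
            return next(self._counter)

existing = [1, 2, 3]
# Two loaders read the table before either has inserted its row
key_a = max_plus_one(existing)
key_b = max_plus_one(existing)

# A sequence hands out a distinct value on every call
sequence = SequenceGenerator(start=4)
seq_keys = [sequence.next_value(), sequence.next_value()]
```

Both `key_a` and `key_b` come out as 4, which is a duplicate key waiting to happen, while the sequence returns 4 and then 5.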
Rejected data remediation
No designed ability to review or correct rejected data. Exceptional conditions were not captured or reported.
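One possible design is a reject table: rows that fail to load are parked with the reason, reviewed, corrected and resubmitted. The sketch below uses Python's built-in `sqlite3`; the table names, columns and functions are assumptions made for illustration.

```python
import json
import sqlite3

def load_rows(conn, rows):
    """Load valid rows; park rejected rows with the reason for later review."""
    for row in rows:
        try:
            amount = float(row["amount"])  # raises ValueError on bad input
            with conn:
                conn.execute(
                    "INSERT INTO fact_orders VALUES (?, ?)",
                    (row["order_id"], amount),
                )
        except (ValueError, sqlite3.IntegrityError) as exc:
            with conn:
                conn.execute(
                    "INSERT INTO rejected_rows (raw, reason) VALUES (?, ?)",
                    (json.dumps(row), str(exc)),
                )

def pending_rejects(conn):
    """List rejected rows that nobody has corrected yet."""
    cursor = conn.execute(
        "SELECT id, raw, reason FROM rejected_rows WHERE resolved = 0"
    )
    return [(rid, json.loads(raw), reason) for rid, raw, reason in cursor]

def resubmit(conn, reject_id, corrected_row):
    """Reload a corrected row and close the original reject.
    If the correction fails again, load_rows records a fresh reject."""
    load_rows(conn, [corrected_row])
    with conn:
        conn.execute(
            "UPDATE rejected_rows SET resolved = 1 WHERE id = ?", (reject_id,)
        )

# Hypothetical tables standing in for the warehouse
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, amount REAL)")
conn.execute(
    "CREATE TABLE rejected_rows (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "raw TEXT, reason TEXT, resolved INTEGER DEFAULT 0)"
)

load_rows(conn, [{"order_id": 1, "amount": "10.5"},
                 {"order_id": 2, "amount": "n/a"}])
rejects = pending_rejects(conn)
resubmit(conn, rejects[0][0], {"order_id": 2, "amount": "12.0"})
remaining = pending_rejects(conn)
loaded = conn.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
```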
Error Messages
Lack of clear definition for logging, wording and communicating errors in a consistent fashion
Error Handling
Lack of clear definition for handling errors and exceptions.
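A small shared catalogue of error codes is one way to get consistent wording and a defined handling policy for each condition. The codes, messages and actions below are hypothetical examples, not definitions from this project.

```python
import logging
from enum import Enum

class LoadError(Enum):
    # Hypothetical catalogue: (code, standard wording, handling policy)
    MISSING_FILE = ("DW001", "Source file not delivered", "abort_load")
    BAD_RECORD = ("DW002", "Record failed validation", "reject_row")

def format_error(error, batch_id, detail):
    """Build every error message in one place so the wording stays consistent."""
    code, text, action = error.value
    return f"[{code}] batch={batch_id} {text}: {detail} (action={action})"

logger = logging.getLogger("dw.load")

def report(error, batch_id, detail):
    """Log the error in the standard format and return the required action."""
    logger.error(format_error(error, batch_id, detail))
    return error.value[2]

message = format_error(LoadError.BAD_RECORD, 42, "amount is negative")
action = report(LoadError.MISSING_FILE, 43, "orders feed")
```

Because the handling policy travels with the code, the load process and the people reading the log agree on what happens next for each type of error.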
Wrap-up
Data Warehouse projects present a unique set of challenges. This large project was rescued from failure with quick focused action that represents some best practices that can be of use for other projects. It is my intent that this can provide some guidance to assist Project Managers on future and existing Data Warehouse projects, and to ensure that the same mistakes don’t happen twice. Heck, the same mistakes shouldn’t happen once ;-)