Monday, August 13, 2012

Enterprise Data Architecture, SOA and Data Services



Decades of R&D and growth around EAI, ETL, MDM and SOA have led us to one conclusion: data matters. It has always been about data. I guess it’s no secret either that content is the king of the consumer web and data is the king of enterprise software.

Interestingly, enterprise data requirements are fundamentally too complex, and too closely driven by the high-volume, multi-dimensional nature of BI systems, to be serviced entirely from a messaging layer. Unfortunately, the data universe of DW, BI and MDM still lives outside the enterprise SOA vision/roadmap, which significantly dents the overall ROI of IT/SOA expenditure.

Solution architects (most often under corporate obligations) often fail to notice that a significant chunk of the top-line business value of SOA resides in the data that the SOA suite pushes between applications and service components. SOA can efficiently orchestrate ETL/ELT modules to process bulk data, batch jobs or file transfers in near real-time. The SOA stack can also integrate BI modules to embed BI metrics within BPEL workflows, generate business events from BI alerts and embed analytics in business rules, and hence help make smart real-time decisions.


SOA plays a significant role in the MDM and Information Architecture (IA) roadmap of any enterprise, big or small. Within the walls of Information Architecture, the SOA tier plays a pivotal role in Data Federation, which encompasses data massaging, validation, cleansing, consolidation and so on. However, the inherent inefficiency of XML in handling large data volumes, together with the fact that almost all SOA suites are built on Java containers, necessitates a highly optimized caching server within the MW tier in order to federate IA using SOA. The added setup and maintenance cost of that caching server steers us toward using SOA Data Services alongside the Data Federation tier.

SOA Data Services are not just services operating on XML. They are enterprise data management end-points that expose highly optimized engines for working on all types of data. The data service as a concept predates SOA; in fact, it dates back to the time when B2B e-commerce and EDI were gaining momentum in the business space in the early 1970s.

Technically, a data service should exhibit two or more of the following attributes:
  • Contract based routing, 
  • Declarative API, 
  • Data encapsulation, 
  • Data abstraction and 
  • Service metadata. 
A close look at the attributes above shows the direct relation between Data Services and the basic pillars of SOA. However, we should never confuse SOA with a unified data tier.
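To make the attributes above a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the names `Contract`, `DataService` and `ContractRouter` are hypothetical and do not come from any SOA product. The router picks a service by matching the request against each service's contract, the `query` method is declarative, the backing store is hidden, and each service carries its own metadata.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, FrozenSet, List


@dataclass(frozen=True)
class Contract:
    """Hypothetical contract: a name plus the fields a request must carry."""
    name: str
    required_fields: FrozenSet[str]


@dataclass
class DataService:
    """Illustrative data service; the backing store is hidden from callers."""
    contract: Contract
    metadata: Dict[str, str]  # service metadata, e.g. owner or version
    _store: List[Dict[str, Any]] = field(default_factory=list, repr=False)

    def put(self, record: Dict[str, Any]) -> None:
        # Data encapsulation: callers never touch _store directly.
        self._store.append(dict(record))

    def query(self, **criteria: Any) -> List[Dict[str, Any]]:
        # Declarative API: callers state what they want, not how to fetch it.
        return [dict(r) for r in self._store
                if all(r.get(k) == v for k, v in criteria.items())]


class ContractRouter:
    """Contract-based routing: first service whose contract the request satisfies."""

    def __init__(self) -> None:
        self._services: List[DataService] = []

    def register(self, service: DataService) -> None:
        self._services.append(service)

    def route(self, request: Dict[str, Any]) -> DataService:
        for service in self._services:
            if service.contract.required_fields.issubset(request.keys()):
                return service
        raise LookupError("no data service contract matches this request")
```

A caller would register, say, a customer service and an order service, then call `router.route({"customer_id": 1})` and query whatever service comes back, without knowing where or how the data is stored.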

The value of current, up-to-date and trusted information is more evident now than ever before. “The two second advantage”[REF-1] is indeed a game changer. I have worked closely on enterprise architectures where SOA was employed as the means of integration between ETL, BI, MDM and Web Applications. By building the enterprise components strategically and then creating elegant, unified integration solutions using SOA, enterprises are more likely to achieve the following objectives:
  • reduce the TCO for data (MDM), 
  • increase agility (hot-pluggable SOA), 
  • improve performance (real-time batch/DW), 
  • reduce risk (integrated human workflow) and 
  • improve business insight (BI)

A mature Data Integration Architecture is the foundation of a strong EA. The success or failure of application integration and SOA composites is directly related to the maturity of the data integration tier.

Enterprise-level Data Services can be broadly grouped into the following categories:

  • Master Data Services can form a trusted single point of reference during real-time or batch data movements. 
  • Batch Data Services can work closely with BPEL/ESB so that the point of control for invoking and running ETL remains inside the SOA tier. This mirrors the classic delegation design pattern and provides architectural loose coupling. 
  • Data Quality Services are typically best used inline with other data services, or otherwise applied statically, directly against the data source. These data services use pre-defined rules and algorithms to clean, reformat and de-duplicate the data. 
  • Data Transformation Services are classic data services used primarily for converting data formats to meet downstream system requirements. These data services require a critical look at the ESB and ETL tiers within the EA landscape in order to make an appropriate and architecturally efficient choice. 
  • Finally, Event Services are driven primarily by EDA/CEP. These data services track, correlate and propagate data on certain pre-defined triggers within the MW.
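As a rough illustration of two of these categories working together, the Python sketch below runs a Data Quality Service inline with a Batch Data Service. Every name in it (`DataQualityService`, `BatchDataService`, the two rule functions, the `email` key) is a hypothetical choice of mine and not taken from any product. The batch service plays the delegate: the SOA tier calls `run()`, while the actual ETL job is an injected callable that could be swapped without touching the caller.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Rule = Callable[[Record], Record]


def strip_whitespace(record: Record) -> Record:
    """Cleaning rule: trim surrounding whitespace from every value."""
    return {key: value.strip() for key, value in record.items()}


def normalize_email(record: Record) -> Record:
    """Reformatting rule: lower-case the email field when it is present."""
    out = dict(record)
    if "email" in out:
        out["email"] = out["email"].lower()
    return out


class DataQualityService:
    """Applies pre-defined rules in order, then de-duplicates on one key."""

    def __init__(self, rules: List[Rule], dedupe_key: str) -> None:
        self.rules = rules
        self.dedupe_key = dedupe_key

    def apply(self, records: Iterable[Record]) -> List[Record]:
        seen = set()
        result: List[Record] = []
        for record in records:
            for rule in self.rules:
                record = rule(record)
            key = record.get(self.dedupe_key)
            if key not in seen:  # keep only the first record per key
                seen.add(key)
                result.append(record)
        return result


class BatchDataService:
    """Delegate invoked from the SOA tier; the ETL job itself is injected."""

    def __init__(self, etl_job: Callable[[List[Record]], int],
                 quality: DataQualityService) -> None:
        self.etl_job = etl_job
        self.quality = quality

    def run(self, records: Iterable[Record]) -> int:
        # Quality rules run inline, before the batch is handed to the ETL job.
        return self.etl_job(self.quality.apply(records))
```

In this sketch the `etl_job` callable stands in for whatever actually loads the data; a BPEL process or ESB flow would only ever see `run()`.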
In closing, I would like to recommend the following pointers for prospective adopters of SOA-based Data Services:
  1. Start small – don’t go big-bang
  2. Use XML judiciously – explore other means
  3. Evaluate trade-offs diligently for any conflicting solutions/approaches
  4. Don’t be a victim of ‘The Accidental Architecture’
  5. The devil is in the details – take a deep dive
Irrespective of the visible and obvious integration points in the existing EA landscape, one must understand that the true value of an architecture tier resides in the flexibility of the core infrastructure to be reconfigured with minimal resource overhead. This reconfigurability is a central characteristic of a Data Services approach, and the foundation of a successful long-term strategy for Enterprise Data Architecture.



References:
REF-1: “The two second advantage” is a concept coined by TIBCO founder Vivek Ranadivé.

Sunday, July 29, 2012

ESB Deployment Patterns - Theory in Practice



The modern Enterprise Service Bus (ESB) is inherited from the classic computing concept of the Information Bus from the 80s, and the ESB was glamorized in the late 90s and early 2000s by technology giants TIBCO, Oracle, IBM and SAG. In the last decade, a lot of open source players have also entered the mainstream competition and produced some amazing ESB products, especially Mule, Fuse, WSO2 and Red Hat. This disruption in the ESB market has only helped end consumers and enterprises alike.

The popularity of SOA and Cloud solutions has also helped the ESB penetrate deeper into the enterprise architecture landscape. Not only does an ESB enhance the SOA and Cloud experience, it also helps address security-related concerns to a great extent for the entire enterprise/LOB, by working in a “Brokered Authentication” pattern and becoming the single point of authentication and authorization. Though the mechanism varies vastly between offerings and between enterprises, the underlying principle remains the same.

However, the advancements and refinements in ESB products over the last couple of decades do not necessarily guarantee higher ROI and lower TCO for the enterprises that choose to use ESB solutions. At the end of the day, it all depends on how one deploys the ESB within the EA landscape and how the EA lays down the roadmap for the ESB within the IT organization. Please remember that an ESB is just a means to a more robust EA, just an enabler.

In the next sections I share my own experiences with ESB deployment topologies and patterns, to help the reader evaluate and compare the possible solutions for his/her use case(s). I suggest that the reader be very critical during the project requirement evaluation phase and not hesitate to use a combination of two or more of the deployment topologies discussed below.

Global ESB Pattern:

  • A single namespace and canonical model for all service component interactions.
  • Easy to maintain, scale and govern.
  • Easy to implement in an SME.
[Figure: Global ESB Deployment Pattern]
  • The most suitable pattern for a heterogeneous, centrally administered and globally centralized EA landscape.
  • The large number of integrations in a large enterprise makes this pattern a less favorable option there, since a single canonical model would be difficult to conceive and manage.
  • The ideal approach for this pattern is bottom-up. However, if you already have a considerable EA landscape and the ESB is a new entrant in an otherwise mature EA, a combination of bottom-up and top-down approaches, i.e. meet in the middle, should be considered.
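The single canonical model is the heart of this pattern, so here is a minimal Python sketch of the idea. It is a sketch under my own assumptions: `CanonicalCustomer`, `GlobalBus` and the message field names are all hypothetical. Each system registers one adapter into the canonical form and one out of it, so adding a new system means writing two adapters rather than one mapping per peer.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CanonicalCustomer:
    """Hypothetical canonical model shared by every system on the bus."""
    customer_id: str
    full_name: str


class GlobalBus:
    """Single bus: every message passes through the canonical model."""

    def __init__(self) -> None:
        self._inbound: Dict[str, Callable[[dict], CanonicalCustomer]] = {}
        self._outbound: Dict[str, Callable[[CanonicalCustomer], dict]] = {}

    def register(self, system: str,
                 to_canonical: Callable[[dict], CanonicalCustomer],
                 from_canonical: Callable[[CanonicalCustomer], dict]) -> None:
        self._inbound[system] = to_canonical
        self._outbound[system] = from_canonical

    def send(self, source: str, target: str, message: dict) -> dict:
        # Source format -> canonical -> target format; no pairwise mappings.
        canonical = self._inbound[source](message)
        return self._outbound[target](canonical)


# Example adapters for two made-up systems, "crm" and "billing".
bus = GlobalBus()
bus.register(
    "crm",
    lambda m: CanonicalCustomer(m["id"], f'{m["first"]} {m["last"]}'),
    lambda c: {"id": c.customer_id,
               "first": c.full_name.split(" ", 1)[0],
               "last": c.full_name.split(" ", 1)[-1]},
)
bus.register(
    "billing",
    lambda m: CanonicalCustomer(m["cust_no"], m["name"]),
    lambda c: {"cust_no": c.customer_id, "name": c.full_name},
)
```

The sketch also hints at why the pattern strains in a large enterprise: every system has to agree on what `CanonicalCustomer` contains, and that agreement gets harder as the number of integrations grows.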

Directly Connected ESB Pattern:

  • One of the most popular and successful deployment patterns for ESB.
  • It provides a means of ‘global decentralization’ and ‘local centralization’ of service components.
  • Most suitable for large organizations with a geographically distributed EA landscape and/or multiple LOBs.
  • This is particularly popular and effective in cloud implementations involving a hybrid deployment strategy. The hybrid cloud model would be represented by one ESB on premise and another, directly connected ESB in the cloud (NOTE: Please consider the complexity related to PII and PCI compliance when working with Directly Connected ESBs in a hybrid cloud model).
[Figure: Directly Connected ESB Deployment Pattern]
  • Common service registries can facilitate service discoverability and also make service governance manageable and the overall architecture scalable.
  • The esb2esb communication protocol hides the service component implementation details, providing loose coupling and abstraction.
  • The esb2esb communication is associated with moderately high latency.

Brokered ESB Pattern:

  • It is a direct representation of the classic hub-and-spoke model.
  • The broker selectively exposes service components to providers, consumers and partners.
  • This deployment pattern is best suited for large organizations with multiple trading partners, operations and/or platforms.
  • The broker ESB regulates the communication and data/information exchange between multiple ESB installations on both sides, where each individual installation is managed and/or governed separately.
[Figure: Brokered ESB Deployment Pattern]
  • Maintenance of service registries and repositories is a complex and costly proposition.
  • Over a period of time, this pattern is the most vulnerable to the Spaghetti (hairball) anti-pattern.
  • The brokered pattern induces inherent latency in the business process and is NOT suitable for use cases where latency is a critical parameter/KPI.
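The selective exposure performed by the broker can be sketched in a few lines of Python. This is an illustration only: `BrokerESB`, `publish` and `call` are names I made up, and a real broker would sit between separate ESB installations rather than in-process handlers. Each service is published together with the set of parties allowed to see it, and every call goes through the hub, which is also where the extra hop (and hence the latency) comes from.

```python
from typing import Any, Callable, Dict, Set


class BrokerESB:
    """Hub of a hub-and-spoke layout; exposes each service to named parties only."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], Any]] = {}
        self._exposed_to: Dict[str, Set[str]] = {}

    def publish(self, name: str, handler: Callable[[Any], Any],
                exposed_to: Set[str]) -> None:
        """Register a service component and the parties that may call it."""
        self._handlers[name] = handler
        self._exposed_to[name] = set(exposed_to)

    def call(self, party: str, name: str, payload: Any) -> Any:
        """Route a call through the hub, refusing parties without exposure."""
        if party not in self._exposed_to.get(name, set()):
            raise PermissionError(f"{name!r} is not exposed to {party!r}")
        return self._handlers[name](payload)
```

For example, a price lookup published with `exposed_to={"partner_a"}` can be called by `partner_a`, while the same call from `partner_b` raises `PermissionError`.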

Federated ESB Pattern:

  • It is the most frequently advised ESB pattern. However, in my opinion it is one of the most difficult ESB patterns to pull off, owing to the complexity associated with the lateral components of governance.
  • It is a good candidate for being referred to as a ‘global ESB pattern’, since it is applicable to multiple use cases, ranging from decentralized to centralized EA and from heterogeneous to homogeneous EA.
  • I like to refer to it as the ‘Glorified one-to-many Broker ESB pattern’.
  • It is a good pattern to use when you want to enforce integration governance from the top down and want to closely monitor and track interfaces, KPIs and service components.
  • This pattern is a good candidate for an enterprise with multiple centrally managed LOBs.
[Figure: Federated ESB Deployment Pattern]
  • Governance should be considered a mandatory requirement when implementing this pattern; otherwise it could lead to the ‘Blob’ and/or ‘Poltergeist’ application anti-patterns.
  • Infrastructure and capacity analysis should be done diligently to meet the service throughput and latency requirements (and other NFRs). An inefficiently structured federated ESB tier could lead to stovepipe systems, which could choke the federated tier.


In my professional experience, I have come to believe that every organization has unique integration scenarios, situations, challenges and requirements, each requiring a very pragmatic and open view. I maintain that an ESB (or, for that matter, any other COTS product) is not the end, but merely one of many possible means to reach the end.