Cloud computing is having an increasingly large influence over the IT landscape. It’s likely that, whether you realize it or not, corporate data already resides, or is migrating, outside the walls of your organization. Recent research by Ventana Research shows that in areas such as customer service, sales, and workforce or human capital management, software as a service (SaaS) or cloud-based applications increasingly are being accepted and adopted. In our benchmark research on business intelligence and performance management, for example, only 53 percent of organizations prefer their systems on-premises, and we expect that percentage to decline in the next 12 to 24 months, in which more than one-third of organizations plan to begin using cloud-based or SaaS applications.
However, cloud-based applications and services raise information management challenges that don’t necessarily exist in on-premises deployments. The new silos of applications and software that enable doing business “in the cloud” also are new corporate data repositories that must be integrated with other enterprise data and must be managed as a whole. Among the many challenges lurking inside the cloud are data accessibility, data consistency, data integration, data quality and data governance.
In many cases the advocates and buyers for using cloud-based services are line-of-business managers who see such solutions as addressing their immediate concerns for rapid deployment with minimal capital outlays, but these business folks may not be aware of the data challenges associated with moving to the cloud. For instance, as more and more data resides in applications managed by third parties, how will the organization bring it all together for analysis, reporting and other necessary uses? Without a capable data integration infrastructure, will users be forced to cut and paste data from reports or export it to spreadsheets, encountering the issues of consistency and accuracy that practice raises?
At last month’s salesforce.com Dreamforce event that my colleague assessed, I spent time examining some of the data integration alternatives available for cloud-based data. Informatica has been investing significant resources in a cloud-based product and now boasts over 1,000 customers using its cloud-based services. At Dreamforce, Informatica added two new products to this portfolio. At $99 a month, Cloud Express is the lowest-priced offering that includes support. This pricing is usage-based and includes up to 300,000 rows of data movement per month. Cloud Express includes scheduling (not available in Informatica’s free product) but is limited to salesforce.com data and does not include application integration features. At the other end of the spectrum, for $6,000 per month, Informatica’s enterprise version offers integration with its PowerCenter product and provides an environment for hybrid integration of cloud and on-premises data. Informatica now has five different cloud-based offerings and price points, which constitute a relatively complete product line.
Pervasive Software has also made a significant investment in cloud-based data integration services. Focused initially on small to midsize business opportunities and point-to-point integration, Pervasive has cloud-enabled its core product as Pervasive Data Integrator v10 Cloud Edition, which my colleague assessed earlier in the year. With 250 customers in production in the cloud and four years of experience working there, Pervasive is a serious contender in the cloud data integration market. Its DataRush technology provides highly parallel operations coupled with elasticity features that spread operations across multiple servers for better performance and throughput. Pervasive does not charge per connection with v10 in the cloud, which could be a significant differentiator for some organizations that need to connect to many different data sources. But there are some limitations to be aware of: life-cycle management features and data lineage features are not fully supported in the cloud yet.
Cloud data integration (like data integration in general) goes beyond traditional structured data. An operating unit of Information Builders, iWay Software not only integrates structured data but can also integrate data from salesforce.com’s collaboration technology Chatter with enterprise systems. My colleague covered iWay’s cloud offerings and their integration with Chatter last year. Cast Iron Systems, acquired by IBM last year, also offers integration with Chatter as well as other structured data sources.
A relatively small company, Boomi has made a name in the cloud data integration space and as a result recently was acquired by Dell. Boomi uses a deployment architecture based around Java Virtual Machines (JVMs). The company calls these deployment units “atoms,” and because they are based on a JVM architecture, they can easily be deployed on-premises as well as in the cloud. These “atoms” can be run in parallel to enhance performance, throughput and scalability, but beware that the process to create multiple instances is a manual one today. Now that it has Dell’s backing, I expect to see this process built into the product to compete at the enterprise level.
SnapLogic, also competing in the cloud data integration space, has some good credentials. Founded by Gaurav Dhillon, a co-founder of Informatica, SnapLogic has focused on building a large developer community to create “Snaps,” or connections to a variety of data sources. While this model makes sense as a way to get more connectivity, for an organization needing to connect to many different data sources it can become a costly alternative, especially when competitors like Pervasive offer connectors at no additional charge. But in the spirit of community-centric software development like that found in the open source market, this could be a new approach to sharing interfaces built by customers, partners and software developers.
Jitterbit offers one of the more interesting applications of cloud data integration capabilities. You can read an assessment of its cloud-based data migration services here. Jitterbit has recognized that the cloud brings different challenges to the world of data integration. It introduced CloudReplicate, which allows you to create a copy of your salesforce.com data in a separate RDBMS instance in the cloud. The vendor will keep this version in sync with the instance in salesforce.com. Customers use the replicated version for a variety of reasons ranging from simple backup to data federation to historical data analysis.
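To make the idea of a replica that is kept in sync a little more concrete, here is a minimal sketch of one common way to do it: copy only the records that changed since the last run and remember a high-water mark. This is not Jitterbit’s implementation, which I have not seen. The function name sync_changes, the table account_replica and the modified_at field are all hypothetical, SQLite stands in for the RDBMS, and the source records are assumed to arrive as plain dictionaries with ISO 8601 timestamps.

```python
import sqlite3


def sync_changes(conn, source_records, last_sync):
    """Upsert records changed after last_sync; return the new high-water mark.

    Assumes ISO 8601 timestamp strings, which sort correctly as plain text.
    All table and field names here are illustrative only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS account_replica ("
        "id TEXT PRIMARY KEY, name TEXT, modified_at TEXT)"
    )
    newest = last_sync
    for rec in source_records:
        # Skip anything the replica already has from an earlier run.
        if rec["modified_at"] <= last_sync:
            continue
        # INSERT OR REPLACE overwrites the existing row with the same id.
        conn.execute(
            "INSERT OR REPLACE INTO account_replica (id, name, modified_at) "
            "VALUES (?, ?, ?)",
            (rec["id"], rec["name"], rec["modified_at"]),
        )
        newest = max(newest, rec["modified_at"])
    conn.commit()
    return newest
```

Each scheduled run would pass in the mark returned by the previous run, so only new or changed records move across the wire. A real service would also have to handle deletions, schema changes and API limits, none of which this sketch attempts.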
The market for cloud data integration products and services is still emerging. We have learned some lessons in the on-premises past that will be applicable to the cloud, and established vendors are aggressively pursuing these market opportunities through a combination of development efforts and acquisitions. We also see new vendors such as Jitterbit and SnapLogic entering the space. One common theme I heard repeated by large and small vendors alike is that cloud data integration is about frequent, smaller transfers of data rather than large bulk operations. Another common theme is that no vendor yet offers all the functionality of fully established on-premises solutions.
As an industry we’ve only begun to understand the challenges and opportunities that are unique to the cloud. This is an area where I’ll be focusing additional attention with some of my research efforts during 2011. Stay tuned for more information as we begin new benchmark research into the current use and market demand.
Let me know your thoughts or come and collaborate with me on Facebook, LinkedIn and Twitter.