Curated by: Luigi Canali De Rossi

Friday, August 24, 2007

Mashups: What Are They? Technical And Social Challenges - Part 2


Mashups are filled with technical challenges, and that is why many technologists are attracted to them. The challenge of having to find new solutions to "old" technical issues while "inventing" new ways to mix and combine existing resources and tools is undoubtedly a very positive motivator for any developer out there.

Photo mashup by Robin and Nico Good - original photos credits - turntable / Dolphin Music - pizza / Palatino join-us

In this second part of this guide to Mashups by Duane Merrill (see Part 1), the focus is on the "Technical Challenges" and "Social Aspects" of this rapidly evolving field.

What to look out for, typical technical challenges and obvious bottlenecks are examined and explained in an introductory, though technically competent way. The guide also offers good considerations regarding accessibility, SEO, security and other issues that are often underestimated during the initial planning of any such project.

The social aspects are no less important in the creation of online mashups and in particular the tradeoff between the protection of intellectual property and consumer privacy versus fair-use and the free flow of information is a notable one.

As always, plenty of links help you learn and find out more about terms and technologies you may not yet be familiar with.

Intro by Robin Good

Photo credit: Sun - The Big Mashup

Mashups: The New Breed of Web App - Part 2

Technical Challenges

Mashup matrix - Photo credit: Vincent Thomé's blog

Like any other data integration domain, mashup development is replete with technical challenges that need to be addressed, especially as mashup applications become more feature- and functionality-rich.

This section touches on a handful of these challenges, some of which you can address and mitigate, while others are open issues.

Data Integration Challenges: Semantic Meaning and Data Quality

Qualitative surveys suggest that the number one enterprise IT concern today is data integration within the enterprise virtual organization. (In this context, I use the term virtual organization to mean a composition of federated business units, each contained within its own administrative domain.)

Like many enterprise IT managers who find themselves faced with the task of integrating legacy data sources (for example, to create corporate dashboards that reflect current business conditions), mashup developers are faced with the analogous challenge of deriving shared semantic meaning between heterogeneous data sets. Therefore, to get an idea of what mashup developers have in store for them, you need look no further than the storied integration challenges faced by enterprise IT.

For example, translation systems between data models must be designed.

When converting data into common forms, reasonable assumptions often have to be made when the mapping is not a complete one (for example, one data source might have a model in which an address-type contains a country-field, whereas another does not). Already challenging, this is exacerbated by the fact that the mashup developers might not be domain experts on the source data models because the models are third-party to them, and these reasonable assumptions might not be intuitive or clear.
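
The incomplete-mapping case can be made concrete with a minimal sketch. Everything named here is a hypothetical illustration, not something taken from a real data source: the CommonAddress form, the two source layouts, and the default_country parameter are all assumptions. The point is only that an assumed value should be recorded as an assumption rather than silently merged with real source data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommonAddress:
    """Hypothetical common form shared by both sources."""
    street: str
    city: str
    country: Optional[str]   # None when the source cannot supply it
    country_assumed: bool    # True when the value is a guess, not source data


def from_source_a(record: dict) -> CommonAddress:
    """Source A (hypothetical) carries an explicit country field."""
    return CommonAddress(record["street"], record["city"], record["country"], False)


def from_source_b(record: dict, default_country: Optional[str] = None) -> CommonAddress:
    """Source B (hypothetical) has no country field, so the mapping is incomplete.

    The caller may supply a default, but the result is flagged as assumed.
    """
    return CommonAddress(
        street=record["addr_line"],
        city=record["town"],
        country=default_country,
        country_assumed=default_country is not None,
    )
```

Keeping the country_assumed flag lets later stages of the mashup decide whether a guessed value is good enough for their purpose.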

In addition to missing data or incomplete mappings, the mashup designer might discover that the data they wish to integrate is not suitable for machine automation; that it needs cleansing.

For example, law enforcement arrest records might be entered inconsistently, using common abbreviations for names (such as "mkt sqr" in one record and "Market Square" in another), making automated reasoning about equality difficult, even with good heuristics.
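
A minimal sketch of one such heuristic follows, assuming a hand-curated abbreviation table. The table contents and the function names are hypothetical; a real cleansing effort would need a far larger vocabulary and human review.

```python
import re

# Hypothetical abbreviation table; a real one would have to be curated by hand.
ABBREVIATIONS = {"mkt": "market", "sqr": "square", "sq": "square", "st": "street"}


def normalize_place(name: str) -> str:
    """Lowercase the name, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def same_place(a: str, b: str) -> bool:
    """Treat two entries as equal when their normalized forms match."""
    return normalize_place(a) == normalize_place(b)
```

With this table, same_place("mkt sqr", "Market Square") is true. The same table also turns "St. James" into "street james", which shows why even good heuristics leave automated equality reasoning difficult.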

Semantic modeling technologies, such as RDF, can help ease the problem of automatic reasoning between different data sets, provided that they are built into the data store.

Legacy data sources are likely to require much human effort in terms of analysis and data cleansing before they can be made available to semantic modeling technologies.

Mashup developers might also have to contend with several issues that IT integration managers might not, one of which is data pollution.

As part of their application design, many mashups solicit public user input. As evidenced in the wiki application domain, this is a double-edged sword:

a) it can be quite powerful because it enables open contribution and best-of-breed data evolution,

b) yet it can be subject to inconsistent, incorrect, or intentionally misleading data entry. The latter can cast doubts on data trustworthiness, which can ultimately compromise the value provided by the mashup.

Another host of integration issues facing mashup developers arises when screen scraping techniques must be used for data acquisition.

As discussed in the previous section, deriving parsing and acquisition tools and data models requires significant reverse-engineering effort. Even in the best case where these tools and models can be created, all it takes is a re-factoring of how the source site presents its content (or mothballing and abandonment) to break the integration process, and cause mashup application failure.
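
The fragility is easy to see in a minimal sketch built on Python's standard html.parser module. The page layout assumed here (prices inside span elements with class "price") and the names PriceParser, ScrapeError, and scrape_prices are hypothetical. The sketch cannot prevent breakage; it only makes the scraper fail loudly when the expected markup disappears, instead of returning empty data.

```python
from html.parser import HTMLParser


class ScrapeError(Exception):
    """Raised when the page no longer matches the layout the scraper expects."""


class PriceParser(HTMLParser):
    """Collects the text of <span class="price"> elements (a hypothetical layout)."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def scrape_prices(html: str) -> list:
    """Return all price strings, or raise if the expected markup is missing."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    if not parser.prices:
        raise ScrapeError("no <span class='price'> found; the source layout may have changed")
    return parser.prices
```

If the source site renames the class or switches to a different element, scrape_prices raises ScrapeError, which the mashup can surface as a clear integration failure.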

Component Challenges

The Ajax model of Web development can provide a much richer and more seamless user experience than the traditional full-page-refresh, but it poses some difficulties as well.

At its core, Ajax entails using the browser's client-side scripting capabilities in conjunction with its DOM to achieve a method of content delivery that was not entirely envisioned by the browser's designers. (Perhaps this hack-like nature of Ajax adds to its appeal.) However, this subjects Ajax-based applications to the same browser compatibility issues that have plagued Web designers ever since Microsoft created Internet Explorer.

For example, Ajax engines make use of an XMLHttpRequest object to exchange data asynchronously with remote servers. In Internet Explorer 6, this object is implemented with ActiveX rather than native JavaScript, which requires that ActiveX be enabled.

A more fundamental requirement is that JavaScript be enabled within the user's browser. This might be a reasonable assumption for the majority of the population, but there are certainly users who use browsers or automated tools that either do not support JavaScript or do not have it enabled. One such set of tools is the robots, spiders, and Web crawlers that aggregate information for Internet and intranet search engines. Without graceful degradation, Ajax-based mashup applications might find themselves missing out on both a minority user base and search engine visibility.

The use of JavaScript to asynchronously update content within the page can also create user interface issues.

Because content is no longer necessarily linked to the URL in the browser's address bar, users might not experience the functionality that they normally expect when they use the browser's BACK button, or the BOOKMARK feature. And, although Ajax can reduce latency by requesting incremental content updates, poor designs can actually hinder the user experience, such as when the granularity of update is small enough that the quantity and overhead of updates saturate the available resources.
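
The granularity problem can be illustrated with simple arithmetic. In the sketch below, the function name and the figures used in the follow-up note are illustrative assumptions rather than measurements; the fixed per-request overhead stands in for headers and connection cost.

```python
def bytes_on_wire(total_payload: int, payload_per_update: int, overhead_per_request: int) -> int:
    """Total bytes sent when total_payload is split into updates of a given size.

    Each update is assumed to pay a fixed overhead_per_request on top of its payload.
    """
    updates = -(-total_payload // payload_per_update)  # ceiling division
    return total_payload + updates * overhead_per_request
```

Assuming 500 bytes of overhead per request, delivering 10,000 bytes in 10-byte updates costs 510,000 bytes in total, whereas 1,000-byte updates cost 15,000 bytes. The finer the granularity, the more the overhead dominates.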

Also, take care to support the user (for example, with visual feedback such as progress bars) while the interface loads or content is updated.

As with any distributed, cross-domain application, mashup developers and content providers alike will also need to address security concerns.

The notion of identity can prove to be a sticky subject, as the traditional Web is primarily built for anonymous access.

Single sign-on is a desirable feature, but there is a multitude of competing technologies (ranging from Microsoft Passport to the Liberty Alliance), thus creating disjointed identity namespaces that you must integrate as well.

Content providers are likely to employ authentication and authorization schemes (which require the notion of secure identity or securely identifiable attributes) in their APIs to enforce business models that involve paid subscriptions or sensitive data.

Sensitive data is also likely to require confidentiality (that is, encryption), and you must take care when you mash it with other sources to not put it at risk.

Identity will also be crucial for auditing and regulatory compliance. Additionally, with data integration happening both on the server and client-side, identity and credential delegation from the user to the mashup service might become a requirement.

Social Challenges


In addition to the technical challenges described in the previous section, social issues have surfaced (or will surface) as mashups become more popular.

One of the biggest social issues facing mashup developers is the tradeoff between the protection of intellectual property and consumer privacy versus fair-use and the free flow of information.

Unwitting content providers (targets of screen scraping), and even content providers who expose APIs to facilitate data retrieval, might determine that their content is being used in a manner that they do not approve of.

(For a good review of Web aggregation and regulations, see the Resources section at the end of this article.)

The mashup Web application genre is still in its infancy, with hobbyist developers producing many mashups in their spare time.

These developers might not be cognizant of (or concerned with) issues such as security. Additionally, content providers are only beginning to see the value in providing APIs for machine-based content access, and many do not consider them a core business focus. This combination can yield poor software quality, as priorities such as testing and quality assurance take the backseat to proof-of-concept and innovation.

The community as a whole will have to work together to assemble open standards and reusable toolkits in order to facilitate mature software development processes.

Before mashups can make the transition from cool toys to sophisticated applications, much work will have to go into distilling robust standards, protocols, models, and toolkits.

For this to happen, major software development industry leaders, content providers, and entrepreneurs will have to find value in mashups, which means viable business models.

API providers will need to determine whether or not to charge for their content, and if so, how (for example, by subscription or per use). Perhaps they will provide varying levels of quality-of-service.

Some marketplace providers, such as eBay or Amazon, might find that the free use of their APIs increases product movement.

Mashup developers might look for an ad-based revenue model, or perhaps build interesting mashup applications with the goal of being acquired.

End of Part 2

Part 1 - Mashups: What Are They? Mashup Genres And Technologies - Part 1





About the author

Duane Merrill has developed grid computing and distributed data integration platforms for over five years. He has been a contributor to the Legion Project at the University of Virginia and a core developer for the Avaki Corporation's distributed enterprise information integration product Avaki. He is currently pursuing his Ph.D. in Computer Science at the University of Virginia.

This article is copyright 2006 Backstop Media and has been republished with permission.

Reference: IBM
posted by Robin Good on Friday, August 24 2007, updated on Tuesday, May 5 2015
