
Extended Reach: An Efficient Content Management Technique
for Sharing and Localizing Content

IBM Technical Report TR-40.0032


December, 2003

Sheila Monheit, IBM Corporate Webmasters, San Jose, CA, United States, Monheit@us.ibm.com
David Leip, IBM Corporate Webmasters, Hawthorne, NY, United States, Leip@us.ibm.com
Sara Elo Dean, IBM Corporate Webmasters, Helsinki, Finland, EloDean@fi.ibm.com
Hidekazu Shirayama, IBM Corporate Webmasters, Tokyo, Japan, Flyhard@jp.ibm.com

Table of Contents

1 Introduction
2 Objectives
3 IBM URI Taxonomy
4 Approach
4.1 ibm.com Content Model
4.2 Multi-Page Publish Scheme
4.3 Enabling Localized Content
4.4 Automating Country Code References
4.5 Shared vs. Localized Text Blocks
4.6 Leadspace Rotation
4.7 Hybrid Approach
5 Evaluating Extended Reach in Pilots
5.1 Pilot 1: Basic extended reach with identical content
5.2 Pilot 2: Enhanced extended reach with localized content
5.3 Pilot 2 Evaluation
6 Future Work

1 Introduction
For global companies such as IBM, it is important from a marketing and brand perspective that
they represent themselves as being “in touch” with the many local national markets in which
they do business. This applies to all aspects and representations of the corporation, including
their web presence. In some cases these markets can be quite small, and it can be difficult to
justify the investment to create and maintain separate web content for each of these markets
individually. The alternative, simply grouping countries together and creating a single web site
for a region, is not particularly attractive. It leaves that set of end users feeling not on par with
the corporation’s larger markets.

A large corporate web site such as ibm.com faces the challenge of serving as wide a set of
customers as efficiently as possible. Two strategies exist for achieving this goal. The first is to
leverage the same content across different formats. For example, the ibm.com corporate news
content is shared across XHTML for standard web browsers; WML, HDML, and cHTML for
pervasive devices; and RSS for content syndication. The second is to share the same
content across different sites. This paper discusses the second approach, named Extended
Reach. Specifically, it explains how IBM has set up multiple country portals that can
be managed, from a content maintenance perspective, as a single portal.

2 Objectives
The Extended Reach project has three main business goals:

1. To make ibm.com available on a wider basis worldwide
2. To reduce the workload of maintaining country portals, especially for smaller countries
3. To flaunt the “I” (International) in IBM

IBM took an early lead in establishing a web presence in more countries than its
competitors. In recent years some of its larger competitors (Dell, HP and
Microsoft) surpassed IBM by creating a web presence in more countries. With the
rollout of the Extended Reach project, IBM has regained the leadership position:
today IBM presents a country portal in 83 countries, while Dell, HP,
Microsoft and other competitors cover fewer.

3 IBM URI Taxonomy


The IBM URI taxonomy centers on subject matter keywords in English and the ISO standard for
two-letter country and language codes [1]. These elements allow presenting a web site visitor
with consistent naming conventions across applications and web sites worldwide.

Examples:

• http://www.ibm.com/ibm/au (About IBM in Australia)
• http://www.ibm.com/news/ve (News in Venezuela)
• http://www.ibm.com/servers/de (Servers in Germany)

If more than one language is used for a country, URIs follow the /<cc>/<lc> format, where <cc>
is the two-letter country code specified in ISO 3166-1 (the alpha-2 code elements) and <lc> is
the two-letter language code.

Examples:

• http://www.ibm.com/e-business/ch/fr (e-business in Switzerland, French version)


• http://www.ibm.com/products/ca/fr (Products & Services in Canada, French version)
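
The taxonomy described above can be sketched in a few lines of Python. This is only an illustration, not code from the ibm.com publishing system: the function name build_uri and the validation rules are assumptions introduced here, and only the /<keyword>/<cc>/<lc> shape comes from the text.

```python
from typing import Optional

BASE = "http://www.ibm.com"  # host used in the examples above


def build_uri(keyword: str, cc: str, lc: Optional[str] = None) -> str:
    """Compose <BASE>/<keyword>/<cc>[/<lc>] (hypothetical helper).

    keyword: English subject keyword, e.g. "news" or "e-business"
    cc:      two-letter country code
    lc:      optional two-letter language code, used when a country
             is served in more than one language
    """
    if len(cc) != 2 or not cc.isalpha():
        raise ValueError(f"expected a two-letter country code, got {cc!r}")
    if lc is not None and (len(lc) != 2 or not lc.isalpha()):
        raise ValueError(f"expected a two-letter language code, got {lc!r}")
    parts = [BASE, keyword.strip("/"), cc.lower()]
    if lc is not None:
        parts.append(lc.lower())
    return "/".join(parts)


news_ve = build_uri("news", "ve")
ebusiness_ch_fr = build_uri("e-business", "ch", "fr")
```

Because the country code is the only part that varies between corresponding pages, a single call per country is enough to derive the URI for each member of a country group.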

Top-level, or root level, directories are restricted to IBM registered trademarks and service
marks, and major, global, cross divisional content areas such as /e-business, /thinkpad, /services
and /products. These keywords must be in English only. For worldwide consistency, URIs are
not translated to the local language. Use of regional web sites and regional URIs is strongly
discouraged.

Furthermore, if consistent URIs do not exist or cannot be implemented for strategic pages due to
application constraints, the ibm.com web servers are configured with redirects so that the
advertised URIs still abide by the URI taxonomy.

Examples:

• http://www.ibm.com/shop/it/customerservice (Online customer support Italy) redirects to
  http://www-134.ibm.com/webapp/wcs/stores/servlet/HelpDisplay?subject=2294556&storeId=380&catalogId=-380&langId=-4
• http://www.ibm.com/shop/uk/help (Online shopping support UK) redirects to
  http://www-134.ibm.com/webapp/wcs/stores/servlet/HelpDisplay?storeId=826&catalogId=-826&dualCurrId=20&langId=826&subject=2294556

The Extended Reach technique builds on the fact that the IBM web URI taxonomy is country
code centric. URIs between corresponding pages for countries vary in general only by country
code. This enables URIs to be programmatically localized for countries within a group.

4 Approach
The Extended Reach technique is applicable for a group of country web sites with the following
criteria:
• Maximum content sharing across multiple countries. The goal is to share most of the
content that makes up the web site, with only a small amount of unique information
maintained separately for each country. Allow for variation in content where a country
has a local business need.
• Group similar small market countries together based on common language and region.
For example:
o 20 Caribbean English language countries
o 7 ASEAN English language countries
Due to translation issues, it is not possible to share content between different languages.
• Enforce a standard layout.
• Support rotation of content to give a greater sense of freshness and even uniqueness
across countries.
• Comply with a standard URI taxonomy to enable the automated localization of standard
URIs.
• Cater for automated country name substitution, but with care.

4.1 ibm.com Content Model


Today, a content management system based on the Extensible Markup Language (XML) is used
to create and maintain ibm.com country portals. By encoding content in XML and layout logic
in the Extensible Stylesheet Language (XSL), the system enforces the separation of content and
presentation. The system also supports reusable XML fragments and manages the dependencies
between such fragments. Using a Java-based user interface, a content editor can upload XSL
stylesheets and multimedia objects, create and edit XML content fragments, compose pages out
of fragments, preview pages, review final published pages, and reject them or promote them to
the final stage in the publishing flow [2].

Every ibm.com web page consists of several fragments: a masthead, footer, left and right
navigation bars, and the main white space. Each of these is built as a separate XML fragment
included in one or more XML documents, or servables. The XML fragments and servables
abide by Document Type Definitions (DTDs). Fragments correspond to reusable components
such as a navigation bar, an image, or a link, and servables to specific page types, such as an
index page, a homepage, or a news article. An XML servable may contain fragments that are
unique to the white space of the page type or reusable fragments. An XML servable is
transformed to output pages in various formats by dedicated XSL stylesheets that control the
presentation of a page. Thus content input and output presentation are tightly controlled by the
appropriate servable and fragment DTDs and the XSL stylesheets.

4.2 Multi-Page Publish Scheme


For countries not within Extended Reach, ibm.com corporate portal country pages are generated
on a 1-1 basis. One input XML servable transformed with one XSL generates one output page
(in HTML, WML, HDML, or RSS format) for one country in one language. Thus, ten XML
servables tagged for ten different countries are transformed by one XSL stylesheet, generating
ten resulting pages. In this way the IBM standard layout, along with the tight DTD control over
the page content, is ensured across every country portal page.

Extended Reach presented the challenge of creating more than one output page from one XSL
transformation of one input XML. The input XML was now a fully reusable XML servable,
itself made up of already reused fragments.

The existing content model and content were analyzed to identify how content could be
efficiently shared across a group of countries. Countries that share a common language and
common content could be grouped together.
The first design introduced no changes to the DTDs in order to avoid the maintenance of two sets
of DTDs, one set for countries with unique content and one for the Extended Reach countries
with identical content. The Extended Reach technique was implemented as a multi-page
publishing scheme in the XSL stylesheets. The existing XSL logic was enhanced to include a
looping mechanism. The new logic could generate multiple outputs from a single XML and
result in a distinct ibm.com corporate portal page for each specified target country. The output
pages were identical in content, apart from the automated localization of the masthead, footer
and URIs.

Once the groupings of countries had been identified, rendering the countries to IBM standard
layouts became very straightforward. Within every XML servable is a COUNTRY element tag,
which specifies the target country page being generated. By adding this tag multiple times, the
stylesheet can process any number of countries.

Single country tagging:

<COMMON>
  <LANGUAGE>en</LANGUAGE>
  <COUNTRY>bd</COUNTRY>
</COMMON>

Multiple country tagging:

<COMMON>
  <LANGUAGE>en</LANGUAGE>
  <COUNTRY>bd</COUNTRY>
  <COUNTRY>lk</COUNTRY>
  <COUNTRY>vn</COUNTRY>
  <COUNTRY>ph</COUNTRY>
  <COUNTRY>my</COUNTRY>
  <COUNTRY>th</COUNTRY>
  <COUNTRY>id</COUNTRY>
  <AUDIENCE>all</AUDIENCE>
</COMMON>

An XML servable also contains the STYLESHEET tag, which identifies the XSL stylesheet to
transform with:

<STYLESHEET>regional_newsindex_xml_html.xsl</STYLESHEET>
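
The multi-page publish loop described above can be sketched in Python as follows. This is a simplified illustration rather than the production stylesheet logic: the SERVABLE root element, the publish function, and the render callback are hypothetical names, and a real implementation would run the XSL transformation named in STYLESHEET instead of the placeholder used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down servable; only COMMON/COUNTRY and STYLESHEET
# are taken from the listings above.
SERVABLE = """<SERVABLE>
  <STYLESHEET>regional_newsindex_xml_html.xsl</STYLESHEET>
  <COMMON>
    <LANGUAGE>en</LANGUAGE>
    <COUNTRY>bd</COUNTRY>
    <COUNTRY>lk</COUNTRY>
    <COUNTRY> vn</COUNTRY>
  </COMMON>
</SERVABLE>"""


def target_countries(servable_xml):
    """Return the country codes listed under COMMON/COUNTRY, in document order."""
    root = ET.fromstring(servable_xml)
    codes = []
    for country in root.findall("./COMMON/COUNTRY"):
        code = (country.text or "").strip()  # tolerate stray whitespace
        if code:
            codes.append(code)
    return codes


def publish(servable_xml, render):
    """Generate one output page per tagged country from a single servable."""
    return {cc: render(cc) for cc in target_countries(servable_xml)}


# Placeholder renderer standing in for the XSL transformation.
pages = publish(SERVABLE, lambda cc: f"<html><!-- page for {cc} --></html>")
```

Adding or removing a COUNTRY element in the servable changes the set of generated pages without touching the rendering logic, which is the property the multi-page scheme relies on.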

4.3 Enabling Localized Content

The first Extended Reach implementation successfully created multiple near-identical,
automatically localized output pages and enforced the IBM layout standard. However,
the approach was too rigid: identical pages left no room for unique country
distinctions. Some ASEAN Extended Reach candidate countries were unable to adopt
the technique because the design did not allow for any localization on the pages. An
enhanced design needed to allow for some custom content identification within the
existing page structures defined in the DTDs.

A content analysis of countries in the same region provided insight into the localization
requirements. Fig 1 and Fig 2 show the www.ibm.com homepages for Malaysia and
Indonesia:

Fig 1: www.ibm.com/my (Malaysian homepage). Annotations: every link on the page refers either
to a country-specific page (for example www.ibm.com/my/offers/thinkpad/ or
www.ibm.com/services/my/) or to a general www.ibm.com page (for example
www.ibm.com/planetwide/select). The country code occurs anywhere within the URI, or not at all.

Fig 2: www.ibm.com/id (Indonesian homepage). Annotations: leadspace views rotate per hit, for
every country. Example links: www.ibm.com/planetwide/select, www.ibm.com/services/bcs/id/,
www.ibm.com/services/id/.

Fig 3: Services links. The Malaysian homepage Services section contains an optional link to
www.ibm.com/financing/my; the Indonesian homepage Services section has no optional link.

Further investigation of content, such as the lists of links in Fig 3, reveals the following:

• The URI taxonomy is consistent within defined sections on a page, so country
  references in URIs can be generated automatically.
• Some links appear only for a subset of countries in a group, so country tagging
  of a link must be enabled.
• Some text blocks are identical across all countries with the exception of the
  local country name, so automatic country name references within text should
  be supported.

4.4 Automating Country Code References


Before Extended Reach, links, such as the ones in Fig 3, were defined in XML as ITEM_TITLE
and ITEM_URL element pairs. The sample below defines the left navigation bar on the
www.ibm.com/us homepage:

<PRIMARY_LINKS>
  <ITEM>
    <ITEM_TITLE>Home / home office</ITEM_TITLE>
    <ITEM_URL>http://www.ibm.com/homeoffice/</ITEM_URL>
  </ITEM>
</PRIMARY_LINKS>
<PRIMARY_LINKS>
  <ITEM>
    <ITEM_TITLE>Small &amp; medium business</ITEM_TITLE>
    <ITEM_URL>http://www.ibm.com/businesscenter/us/</ITEM_URL>
  </ITEM>
</PRIMARY_LINKS>
<PRIMARY_LINKS>
  <ITEM>
    <ITEM_TITLE>Large enterprise</ITEM_TITLE>
    <ITEM_URL>http://www.ibm.com/largeenterprise/us/</ITEM_URL>
  </ITEM>
</PRIMARY_LINKS>
The transforming XSL loops over all the PRIMARY_LINKS elements and generates the
following output HTML:

http://www.ibm.com/hom
http://www.ibm.com/businesscenter/us/
http://www.ibm.com/largeenterprise/us/

Based on the definition in the XML, all the links point to US URIs and every title and URI pair
is included in the output. No mechanism, or need, exists to specify conditions of links, such as
their presence or absence in the output, because the navigation bar is dedicated to the US.

The following patterns were defined to enable flexible localization of links. Content and XSL
stylesheets were enhanced to respectively include and process the new logic.

%%CC
    Substitute every country (cc) listed under <COMMON/COUNTRY> in the URI string.

%%INCLIST_cc_cc_%%
    Substitute ONLY countries included in the INCLIST string.

[[%%INCLIST_cc_cc_%%]]
    Include this link (which contains no CC references at all, e.g. www.ibm.com) for
    countries in the INCLIST. Note: this string is added at the end of the URI string.

%%EXCLIST_cc_cc_%%
    Substitute ONLY countries NOT included in the EXCLIST string.

[[%%EXCLIST_cc_cc_%%]]
    Include this link (which contains no CC references at all, e.g. www.ibm.com) for
    countries NOT included in the EXCLIST. Note: this string is added at the end of the
    URI string.

Going back to the sample services section in Fig 3, the XML for that section in the new syntax
becomes:

<SERVICES_BOX>
  <SERVICES_GRAY_TITLE>Services</SERVICES_GRAY_TITLE>
  <SERVICES_LINKS>
    <LINK_TEXT>Business and IT services</LINK_TEXT>
    <LINK_URL>http://www.ibm.com/services/%%CC/</LINK_URL>
  </SERVICES_LINKS>
  <SERVICES_LINKS>
    <LINK_TEXT>Business consulting services</LINK_TEXT>
    <LINK_URL>http://www.ibm.com/bcs/%%CC/</LINK_URL>
  </SERVICES_LINKS>
  <SERVICES_LINKS>
    <LINK_TEXT>Infrastructure services</LINK_TEXT>
    <LINK_URL>http://www.ibm.com/services/%%CC/strategy/capability/fullinfra.html</LINK_URL>
  </SERVICES_LINKS>
  <SERVICES_LINKS>
    <LINK_TEXT>On demand services</LINK_TEXT>
    <LINK_URL>http://www.ibm.com/services/%%CC/ondemand/</LINK_URL>
  </SERVICES_LINKS>
  <SERVICES_LINKS>
    <LINK_TEXT>Financing</LINK_TEXT>
    <LINK_URL>http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/</LINK_URL>
  </SERVICES_LINKS>
</SERVICES_BOX>

Further down in the same XML servable the country definitions are:

<COMMON>
<LANGUAGE>en</LANGUAGE>
<COUNTRY>my</COUNTRY>
<COUNTRY>ph</COUNTRY>
<COUNTRY>th</COUNTRY>
<COUNTRY>id</COUNTRY>
</COMMON>

The output seen in Fig 3 for the Malaysian and Indonesian homepages is generated by the
Extended Reach XSL below:

<xsl:template name="regionalLinks">
  <xsl:param name="cc"/>
  <xsl:param name="link"/>
  <xsl:choose>
    <!-- %%CC: substitute the current country code into the URI -->
    <xsl:when test="contains($link,'%%CC')">
      <xsl:value-of select="concat(substring-before($link,'%%CC'),$cc,substring-after($link,'%%CC'))"/>
    </xsl:when>
    <!-- %%INCLIST_..._%%: keep the link only for the listed countries -->
    <xsl:when test="contains($link,'%%INCLIST_')">
      <xsl:variable name="IncList" select="substring-before(substring-after($link,'%%INCLIST_'),'%%')"/>
      <xsl:choose>
        <xsl:when test="contains($IncList,$cc)">
          <xsl:choose>
            <!-- [[...]]: the URI carries no country code, so strip the pattern -->
            <xsl:when test="contains($link,'[[%%INCLIST_')">
              <xsl:value-of select="substring-before($link,'[[%%INCLIST')"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="concat(substring-before($link,'%%INCLIST_'),$cc,substring-after($link,'_%%'))"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:when>
        <!-- current country is not listed: return an empty string -->
        <xsl:otherwise>
          <xsl:value-of select="''"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <!-- %%EXCLIST_..._%%: keep the link for every country except those listed -->
    <xsl:when test="contains($link,'%%EXCLIST_')">
      <xsl:variable name="ExcList" select="substring-before(substring-after($link,'%%EXCLIST_'),'%%')"/>
      <xsl:choose>
        <!-- current country is excluded: return an empty string -->
        <xsl:when test="contains($ExcList,$cc)">
          <xsl:value-of select="''"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:choose>
            <xsl:when test="contains($link,'[[%%EXCLIST_')">
              <xsl:value-of select="substring-before($link,'[[%%EXCLIST')"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="concat(substring-before($link,'%%EXCLIST_'),$cc,substring-after($link,'_%%'))"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <!-- no pattern: pass the link through unchanged -->
    <xsl:otherwise>
      <xsl:value-of select="$link"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

A detailed explanation of the XSL follows:

The XSL template is passed two parameters by the calling routine:

1. cc, the country code for the current pass of the for-each loop over
   COMMON/COUNTRY:
   <COMMON>
     <LANGUAGE>en</LANGUAGE>
     <COUNTRY>my</COUNTRY>
     <COUNTRY>ph</COUNTRY>
     <COUNTRY>th</COUNTRY>
     <COUNTRY>id</COUNTRY>
   </COMMON>
   In the first pass cc=my (Malaysia), then ph (Philippines), and so on.

2. link, the string containing the URI information, that is, the contents
   of the <LINK_URL> element:
   <LINK_URL>http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/</LINK_URL>

The template is executed inside the COMMON/COUNTRY for-each loop, once for
each URI that requires processing. In this example, the cc variable keeps its
value until all the LINK_URL elements have been processed. At that point cc is
assigned the value of the next COUNTRY element and the processing of each
LINK_URL is repeated.

The links being processed are, in order:

1. http://www.ibm.com/services/%%CC/
2. http://www.ibm.com/bcs/%%CC/
3. http://www.ibm.com/services/%%CC/strategy/capability/fullinfra.html
4. http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/

The following XSL logic occurs for each pass through these links:

For the first three links (1, 2 and 3) the value of cc, the country being processed, is substituted
directly into the link string at the exact location of the %%CC notation. Thus, when processing
the first COMMON/COUNTRY element my, the first three links print as

http://www.ibm.com/services/my/
http://www.ibm.com/bcs/my/
http://www.ibm.com/services/my/strategy/capability/fullinfra.html

and when processing the second COMMON/COUNTRY element ph, the same links print as
http://www.ibm.com/services/ph/
http://www.ibm.com/bcs/ph/
http://www.ibm.com/services/ph/strategy/capability/fullinfra.html

The processing of the fourth link is more complicated.


http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/

When the XSL encounters the %%INCLIST or %%EXCLIST pattern, it triggers two
conditional tests:

1. First, it parses the string up to the closing _%% to determine whether the current cc variable
is relevant for this string. In this case, the Malaysia (my), Thailand (th), and Philippines (ph)
homepages should all include this URI. The Indonesia (id) homepage should not include it.

This could also have been represented as:

http://www.ibm.com/financing/%%EXCLIST_id_%%/

and would have produced the same results. For the EXCLIST pattern, the conditional test parses
the string to see if the current cc is NOT in the list, and if so, the link is included.

2. Second, if the URI string is applicable for the current cc variable, the next conditional test
determines whether the URI string contains a country reference in its syntax, or whether it is a
general ibm.com URI that has no country reference at all. This test performs a second parse to
determine whether the INCLIST or EXCLIST pattern sits at the end of the URI string, surrounded
by the [[ opening and ]] closing brackets. If so, the URI string does not include a country
reference, and the pattern is simply stripped from the URI.

An example is the pattern:

http://www.lotus.com/[[%%INCLIST_my_%%]]

which prints out the link without any country code http://www.lotus.com/ on the Malaysian page
only.
Last, if the string being processed with the INCLIST or EXCLIST pattern is not applicable to the
current cc variable, the XSL returns a blank string. This is necessary for later processing when
the URI and TITLE are both processed for the final output. The TITLE is always included in the
input XML, regardless of country tagging, so to ensure that no TITLE without a corresponding
URI is inserted into output HTML, a blank string is required for a last test before the HTML
output is created. If the returned URI string is blank, no TITLE/URI combination is included in
the HTML; if it isn’t blank, the returned string, now containing the correct country tags, along
with the corresponding TITLE, is included in the HTML.
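
The decision logic walked through above can be restated as a short Python sketch. It is a port for illustration only, not the production stylesheet: the names resolve_link and render_links are introduced here, and the sketch compares whole country codes, whereas the XSL shown earlier uses a substring test (contains) against the list string.

```python
def resolve_link(link: str, cc: str) -> str:
    """Return the localized URI for country cc, or "" if the link does not apply."""
    if "%%CC" in link:
        # Plain substitution of the first %%CC placeholder.
        before, _, after = link.partition("%%CC")
        return before + cc + after
    for kind in ("INCLIST", "EXCLIST"):
        marker = f"%%{kind}_"
        if marker not in link:
            continue
        # Text between the marker and the closing %%, e.g. "my_th_ph_".
        listed = link.split(marker, 1)[1].split("%%", 1)[0]
        codes = [code for code in listed.split("_") if code]
        applies = (cc in codes) if kind == "INCLIST" else (cc not in codes)
        if not applies:
            return ""  # blank string: caller drops the TITLE/URI pair
        bracketed = "[[" + marker
        if bracketed in link:
            # General URI without a country code: strip the pattern.
            return link.split(bracketed, 1)[0]
        # Replace the whole pattern with the current country code.
        return link.split(marker, 1)[0] + cc + link.split("_%%", 1)[1]
    return link  # no pattern at all


def render_links(items, cc):
    """Keep only the (title, uri) pairs whose resolved URI is non-blank."""
    resolved = []
    for title, link in items:
        uri = resolve_link(link, cc)
        if uri:
            resolved.append((title, uri))
    return resolved


ITEMS = [
    ("Business and IT services", "http://www.ibm.com/services/%%CC/"),
    ("Financing", "http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/"),
]
malaysia = render_links(ITEMS, "my")
indonesia = render_links(ITEMS, "id")
```

Returning a blank string for a link that does not apply keeps the filtering decision in one place: the caller only has to test the resolved URI before emitting the title.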

4.5 Shared vs. Localized Text Blocks


A comparison of About IBM pages provides a good example of the types of text blocks shared
among and localized by countries.

Fig 4: www.ibm.com/ibm/my (About IBM page for Malaysia). Annotated regions: a text block that
all countries share, which may include the country name in the text; an optional localized
photo; shared financial info, with an additional section for localized financial info allowed;
a text block with localized information.

Fig 5: www.ibm.com/ibm/id (About IBM page for Indonesia). Annotated regions are the same as in
Fig 4.

The examples in Figs 4 and 5 illustrate different types of text blocks, namely shared and
localized. A shared text block is reusable, but requires some processing to allow for minor
localization in order to give the text a country specific feel. For example, in the first section, it
would be ideal if a country could use the general text, and insert one or more localized sentences.

A localized text block is specific to a country only. For example: the history of IBM in the
country, the picture of the local general manager, or the contact information for the country
shown in Fig 6.

Fig 6: www.ibm.com/ibm/my continued


For the shared text block, the text processing XSL template is modified to accept a TAG,
(%%COUNTRYNAME), that serves as a placeholder for the country name within a text block.
Using standard XSL, this text processing template can be invoked with the country name
(for example, Malaysia) as a parameter. The XSL processing is a standard text substitution
template, one that recursively parses a string and substitutes every instance of TAG with the
passed-in parameter value.

A less obvious, but equally beneficial, outcome of this first type of text substitution is its
application to the HTML Meta tags:

Malaysian Meta tags:


<meta name="IBM.Country" content="my"/>
<meta name="Description" content="The IBM Malaysia home page, entry point to
information about IBM products and services."/>
<meta name="Abstract" content="The IBM Malaysia home page, entry point to
information about IBM products and services."/>

The corresponding Indonesian Meta tags:


<meta name="IBM.Country" content="id"/>
<meta name="Description" content="The IBM Indonesia home page, entry point
to information about IBM products and services."/>
<meta name="Abstract" content="The IBM Indonesia home page, entry point to
information about IBM products and services."/>
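
The recursive substitution behind these Meta tags can be sketched in Python as below. The recursion mirrors the way a substitution template is typically written in XSL, but the function name substitute and the DESCRIPTION_TEMPLATE string are assumptions made for this example; the source shows only the generated output, not the shared input text.

```python
def substitute(text: str, tag: str, value: str) -> str:
    """Replace every occurrence of tag in text with value, recursively.

    Each call handles the first occurrence and recurses on the remainder,
    so the recursion depth equals the number of occurrences.
    """
    if not tag or tag not in text:
        return text
    before, _, after = text.partition(tag)
    return before + value + substitute(after, tag, value)


# Hypothetical shared text containing the placeholder.
DESCRIPTION_TEMPLATE = (
    "The IBM %%COUNTRYNAME home page, entry point to "
    "information about IBM products and services."
)

malaysia = substitute(DESCRIPTION_TEMPLATE, "%%COUNTRYNAME", "Malaysia")
indonesia = substitute(DESCRIPTION_TEMPLATE, "%%COUNTRYNAME", "Indonesia")
```

With one shared template and one country name per pass, the Description and Abstract values differ between countries without any country-specific content being authored.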

The second type, the localized text block, requires a change beyond the Extended Reach
approach described so far where only XSL processing and content are enhanced. Minor DTD
changes need to be introduced to accommodate the inclusion of localized blocks of text in an
XML servable.

The DTD for the About IBM page, along with all the other portal pages, already accommodates
the inclusion of reusable XML fragments.

The root element for About IBM DTD:

<!ELEMENT ABOUT_IBM (SYSTEM, TITLE, TITLE_GRAPHIC?, LONG_DESCRIPTION?,
SITE_SECTION, LEFT_NAVBAR, PHOTO?, PHOTO_URL?, CAPTION?, BLUE_TITLE?,
COMPANY_INFO?, COUNTRY_COMPANY_INFO?, CONTACT_INFO?, FINANCIAL?,
ADDITIONAL_INFO*, INLINE_ELEMENTS?, PUBLISHINFO+, COMMON, META_INFORMATION)>

In this example, the underlined elements are subfragments, reusable pieces of XML that can be
included in the full About IBM XML servable. To accommodate the requirements for localized
text blocks, the About IBM DTD was modified to create the Regional About IBM DTD:

<!ELEMENT REGIONAL_ABOUTIBM (SYSTEM, TITLE, TITLE_GRAPHIC?,
LONG_DESCRIPTION?, SITE_SECTION, LEFT_NAVBAR, PHOTO_SECTION*, BLUE_TITLE?,
COMPANY_INFO?, COUNTRY_COMPANY_INPUT*, CONTACT_INFO*,
FINANCIAL*, ADDITIONAL_INFO*, INLINE_ELEMENTS?, PUBLISHINFO+, COMMON,
META_INFORMATION)>

The differences between the two versions of the DTD are the additional fragment elements in the
regional version: PHOTO_SECTION and COUNTRY_COMPANY_INPUT. In addition, some
of the fragments formerly defined as ‘cardinality zero or one’ (?) were modified to ‘cardinality
zero or more’ (*).

These changes provide the ability to include separate XML fragments for the localized text
blocks. For example, in the regional About IBM servable created for Bangladesh (bd), Sri Lanka
(lk), Vietnam (vn), Philippines (ph), Malaysia (my), Thailand (th) and Indonesia (id), the
following XML fragments are included:

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO”>
<COUNTRY_PHOTO>
. . .
<COMMON DATATYPE="NOLABEL">
<LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
<COUNTRY DATATYPE="ASSOCLIST">my</COUNTRY>
<AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COMMON>
</COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO”>
<COUNTRY_PHOTO>
. . .
<COMMON DATATYPE="NOLABEL">
<LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
<COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
<AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO”>
<COUNTRY_PHOTO>
. . .
<COMMON DATATYPE="NOLABEL">
<LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
<COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
<AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO”>
<COUNTRY_PHOTO>
. . .
<COMMON DATATYPE="NOLABEL">
<LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
<COUNTRY DATATYPE="ASSOCLIST">th</COUNTRY>
<AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COUNTRY_PHOTO>
</PHOTO_SECTION>

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO”>
<COUNTRY_PHOTO>
. . .
<COMMON DATATYPE="NOLABEL">
<LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
<COUNTRY DATATYPE="ASSOCLIST">vn</COUNTRY>
<AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COUNTRY_PHOTO>
</PHOTO_SECTION>

Similar sections of XML exist for other selected subfragment types
such as COUNTRY_COMPANY_INPUT and CONTACT_INFO.

This is a collapsed view of the XML containing only the ID of each included subfragment. Note
the different number of fragments of each type due to the fact that localized fragments are of
cardinality zero or more.

Expanding any of the subfragments reveals the XML elements that identify the applicable
country.

<COMMON DATATYPE="NOLABEL">
<LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
<COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
<AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COMMON>

During XSL processing of this servable, within the for-each loop for the servable
COMMON/COUNTRY, a test is performed to verify the existence of a localized fragment and
its applicability to the current cc variable. If the cc variable of the servable matches the cc
variable of the fragment, the contents of the XML fragment are included in the generation of the
output.

This test within the XSL is shown below:

<xsl:if test="boolean(../../COUNTRY_COMPANY_INPUT[COUNTRY_COMPANY_INFO/COMMON/COUNTRY=$cc])">
  <xsl:apply-templates
      select="../../COUNTRY_COMPANY_INPUT[COUNTRY_COMPANY_INFO/COMMON/COUNTRY=$cc]">
    <xsl:with-param name="directoryPrefix" select="$directoryPrefix"/>
    <xsl:with-param name="countryName" select="$countryName"/>
    <xsl:with-param name="cc" select="$cc"/>
  </xsl:apply-templates>
</xsl:if>
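
The same selection can be sketched in Python with the standard library's ElementTree. This is an illustrative equivalent of the test above, not the production code: the sample servable is heavily trimmed, and the TEXT element and the function name localized_fragments are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical servable: two localized fragments (my, ph) and a
# servable-level COMMON block listing three target countries.
SERVABLE = """<REGIONAL_ABOUTIBM>
  <COUNTRY_COMPANY_INPUT>
    <COUNTRY_COMPANY_INFO>
      <TEXT>History of IBM in Malaysia</TEXT>
      <COMMON><COUNTRY>my</COUNTRY></COMMON>
    </COUNTRY_COMPANY_INFO>
  </COUNTRY_COMPANY_INPUT>
  <COUNTRY_COMPANY_INPUT>
    <COUNTRY_COMPANY_INFO>
      <TEXT>History of IBM in the Philippines</TEXT>
      <COMMON><COUNTRY>ph</COUNTRY></COMMON>
    </COUNTRY_COMPANY_INFO>
  </COUNTRY_COMPANY_INPUT>
  <COMMON>
    <COUNTRY>my</COUNTRY>
    <COUNTRY>ph</COUNTRY>
    <COUNTRY>id</COUNTRY>
  </COMMON>
</REGIONAL_ABOUTIBM>"""


def localized_fragments(root, cc):
    """Return the COUNTRY_COMPANY_INPUT fragments tagged for country cc."""
    matches = []
    for fragment in root.findall("./COUNTRY_COMPANY_INPUT"):
        tagged = fragment.findall("./COUNTRY_COMPANY_INFO/COMMON/COUNTRY")
        if any((country.text or "").strip() == cc for country in tagged):
            matches.append(fragment)
    return matches


root = ET.fromstring(SERVABLE)
texts_by_country = {}
# Outer loop over the servable-level countries, as in the for-each loop.
for country in root.findall("./COMMON/COUNTRY"):
    cc = (country.text or "").strip()
    texts_by_country[cc] = [
        fragment.findtext("./COUNTRY_COMPANY_INFO/TEXT")
        for fragment in localized_fragments(root, cc)
    ]
```

A country with no matching fragment, such as id in this sample, simply gets an empty list, so its page is generated from the shared content alone.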

4.6 Leadspace Rotation

The www.ibm.com homepages have a unique set of criteria: the ability to display rotating
leadspace fragments at the top of the white space for each homepage. This feature is shown in
Figures 1 and 2, where the leadspaces differ between the Malaysian and Indonesian homepages.
This feature is enabled by the homepage engine, which is run for every www.ibm.com
homepage, regardless of the manner in which it was created. No modifications were required of
the homepage engine to enable it to be used with the extended reach model. However, the use of
the engine with the extended reach model adds another level of uniqueness to each country page
generated from only one XML source.
4.7 Hybrid Approach

During the first phase of the Extended Reach project, the design was restricted to an
implementation that would not require modification of the existing DTDs. Not having to rebuild
existing content was a major consideration. For the most part, existing DTDs could
accommodate content for the multi-publish output model.

In the second phase of Extended Reach, an opportunity arose to add a new set of pages, with no
existing DTDs, into the www.ibm.com Corporate Portal: the Software pages for the ASEAN
countries. Since there were no existing DTDs for this set of pages, a completely new design
could be implemented, limited only by the restrictions set by the content management system.

The design team decided that combining the earlier approach with some modifications works
best. The %%CC notation within an XML tag is still used as a placeholder for country code
substitutions. However, rather than using the inclusion/exclusion notation within the XML tag,
e.g.

http://www.ibm.com/financing/%%INCLIST_my_th_ph_%%/

editors add discrete country tags in the XML to identify the applicable countries. This approach
makes content preparation simpler and less error-prone for editors: they can choose a country
from a dropdown list rather than typing out a string in the defined syntax for each URI.
Furthermore, this approach is more consistent with standard XML tagging, as it separates the
URI from the country restrictions placed on it.

<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
<COUNTRY_PHOTO>
<TITLE DATATYPE="STRING" LINKABLE="TITLE">asean Software Home #Lead Image - IBM
Lotus Workplace</TITLE>
<PHOTO DATATYPE="STRING" SUBFRAGMENTTYPE="IMAGE">
<IMAGE>

</IMAGE>
</PHOTO>
<PHOTO_URL>
<ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
</PHOTO_URL>
<COMMON DATATYPE="NOLABEL">
<LANGUAGE DATATYPE="ASSOCLIST">en</LANGUAGE>
<COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
<COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
<AUDIENCE DATATYPE="ASSOCLIST" LINKABLE="AUDIENCE">all</AUDIENCE>
</COMMON>
</COUNTRY_PHOTO>
</PHOTO_SECTION>

In the example above, the PHOTO_SECTION element is a fragment tagged for id (Indonesia) and
ph (Philippines) only. It is not output for the other countries listed in the top-level country
tags that denote the overall applicability of the page.

Note the element


<ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>

and the following elements

<COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
<COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>

The XSL processes the ITEM_URL element, replacing %%CC with the country code, and includes it
only in the output pages for id and ph.
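
The selection logic can be illustrated outside XSL. The following is a minimal Python sketch, not the XSL used in the project: it assumes a trimmed copy of the fragment above, and the helper name localized_urls is hypothetical. It emits the ITEM_URL only for countries named in the COUNTRY tags and substitutes the country code for %%CC.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fragment above (TITLE, PHOTO, and other tags omitted).
FRAGMENT = """
<PHOTO_SECTION SUBFRAGMENTTYPE="COUNTRY_PHOTO">
  <COUNTRY_PHOTO>
    <PHOTO_URL>
      <ITEM_URL>http://www.ibm.com/software/%%CC/lotusworkplace/</ITEM_URL>
    </PHOTO_URL>
    <COMMON DATATYPE="NOLABEL">
      <COUNTRY DATATYPE="ASSOCLIST">id</COUNTRY>
      <COUNTRY DATATYPE="ASSOCLIST">ph</COUNTRY>
    </COMMON>
  </COUNTRY_PHOTO>
</PHOTO_SECTION>
"""


def localized_urls(fragment_xml, cc):
    """Return the ITEM_URLs to emit for country code cc (hypothetical helper)."""
    root = ET.fromstring(fragment_xml)
    cc = cc.lower()
    urls = []
    for photo in root.iter("COUNTRY_PHOTO"):
        # Collect the discrete country tags attached to this fragment.
        tagged = {c.text.strip().lower() for c in photo.iter("COUNTRY") if c.text}
        if cc not in tagged:
            continue  # fragment is not tagged for this country: omit it
        for item in photo.iter("ITEM_URL"):
            # Replace the %%CC placeholder with the country code.
            urls.append(item.text.strip().replace("%%CC", cc))
    return urls


for cc in ("id", "ph", "my"):
    print(cc, localized_urls(FRAGMENT, cc))
```

For id and ph the sketch returns the URL with the country code filled in; for my (Malaysia), which is not tagged, it returns an empty list.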

5 Evaluating Extended Reach in Pilots


The first pilot, Extended Reach, supported the output of identical pages with minimal automated
localization. The second pilot, Enhanced Extended Reach, supported localized content within
otherwise identical pages.

5.1 Pilot 1: Basic extended reach with identical content


The Extended Reach functionality was first rolled out in the fall of 2002 for two groups: twenty
English-speaking Caribbean countries and three English-speaking ASEAN countries. At the time,
the definition of the Extended Reach technique was strict: the countries in an Extended Reach
group had to share identical content. The only localization the model allowed was the automatic
replacement of the ISO country code in URIs. Each portal page was otherwise identical across
the countries, with an automatically localized masthead and footer.

This model proved to fit the lowest-resource countries, where little or no localized content
existed. Such country portals consisted of little more than the minimum 9 required top-level
pages and a sufficient flow of news articles to keep the news section up to date. The three
ASEAN countries that adopted this technique, namely Bangladesh, Sri Lanka, and Vietnam, as
well as the twenty Caribbean countries, did benefit from the feature to an extent. A single update
in the content management system published out to three web pages, thus reducing the time and
money required to keep the sites fresh. An additional benefit was the reduced time required to
launch new sites. The twenty Caribbean country portals did not exist before Extended Reach.
Their parallel launch took less than one hour, instead of roughly twenty times that if each one
had been managed and launched as a separate portal.

However, when evaluating whether the pilot had improved site quality, it became obvious that
the Extended Reach approach did not solve the problem of content creation. The countries had
so few resources that they could not even upload news articles of regional relevance that had
already been written and published for the larger ASEAN or Americas markets.

This result led to a re-examination of the Extended Reach model itself. A second round of
requirements was gathered from the ASEAN web management. Each Extended Reach group of
countries clearly needs at least one country with sufficient funds to create fresh content on an
ongoing basis. As the content is uploaded into the content management system, the other countries
in the same group immediately benefit from the content updates. The question to the ASEAN
team was: How does the restriction on identical content need to be relaxed in order to
accommodate countries with localized content in the same Extended Reach group?

5.2 Pilot 2: Enhanced extended reach with localized content

During the summer of 2003, the ASEAN web management articulated the requirements for
including Indonesia, Malaysia, the Philippines, and Thailand in the existing Extended Reach
group. The web manager analyzed each page and stated its need for localization, i.e., which
areas of the page needed to be optional and filled in with content for only a subset of the
countries in the group.

For example, one requirement was:

“The right hand navigation modules on the Products & Services page need the ability to be
localized as they are used to link to features that do not exist for all countries.”

To keep the approach general and allow for future modification, each specific requirement
was generalized into a rule. For example, the requirement above turned into the following rule in
the content model:

“The element called related_info in the portal DTDs must be able to be tagged for one, some, or
all of the countries in the Extended Reach group, and should only appear on the output pages for
the tagged countries.”

The technique described in Section 4.5 enables this functionality today for all pages, not only the
Products & Services page.
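
The rule can be sketched in a few lines of Python. This is an illustration only, not the content management system's implementation: the module names and the publish function are hypothetical, and "all countries" is expressed here simply by listing every country code in the group.

```python
# ISO country codes for the seven-country ASEAN Extended Reach group.
GROUP = ["my", "id", "ph", "th", "lk", "bd", "vn"]

# Hypothetical related_info modules, each tagged with the countries it applies to.
MODULES = [
    ("financing-offer", ["my"]),        # tagged for one country
    ("lotus-feature", ["id", "ph"]),    # tagged for some countries
    ("support-links", list(GROUP)),     # tagged for all countries in the group
]


def publish(modules, group):
    """Build the module list for each country page from one shared source."""
    pages = {cc: [] for cc in group}
    for name, countries in modules:
        for cc in countries:
            if cc in pages:  # ignore tags for countries outside the group
                pages[cc].append(name)
    return pages


pages = publish(MODULES, GROUP)
for cc in GROUP:
    print(cc, pages[cc])
```

Each country page receives only the modules tagged for it, so a single source yields seven pages that differ only where the tags differ.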

5.3 Pilot 2 Evaluation


Enhanced Extended Reach for ASEAN Country Portals was successfully deployed on September
24, 2003, for the following 7 countries: Malaysia, Indonesia, the Philippines, Thailand, Sri Lanka,
Bangladesh, and Vietnam. An improvement of 85.8% in time, and thus in web site maintenance
cost, was achieved for news articles.

Enhanced Extended Reach for the ASEAN Software Portal was successfully deployed on
October 8, 2003, for the following 5 countries: India, Singapore, Malaysia, Thailand, and the
Philippines.

Quotes from the ASEAN team on the reduced workload:

From Yee Nam Sng, ASEAN Site manager:


“The News section provides the most savings and efficiency. This is because most
news articles are replicated without any change (except for local URIs) across all
countries.”
“Homepage marketing modules provide savings. We are able to achieve faster
turnaround and some savings by planning our updates and marketing modules across
ASEAN carefully.”

From AP Creative Services editors:


“Out of the three Enhanced Extended Reach implementations, the news fragments gain
the most benefits. Although it might only save around 15 minutes per news/country,
it has saved us from tedious job to replicate the content and manually reposition the
tiers. It also has limited the chance of errors. Publishing is now a bliss. Less
fragments to load, review and publish.”

“The psychological efficiency is what we feel most. It's really tedious to duplicate
the same thing over and over again. This Enhanced Extended Reach approach has
increased the "Morale" of the editor by taking off these duplicate tasks.”

6 Future Work
Given the success of the Enhanced Extended Reach model and the cost savings it has
demonstrated, new country groupings are expected to be created. Some candidates include
regions where ibm.com does not yet have country portals:
• Americas Spanish
• Middle East Arabic
• Africa English
• Africa French

Another direction is to apply the same technique to new sets of pages, much as was done for the
Software pages for the ASEAN countries. In addition, the results of the Software page pilot
should provide lessons to the ibm.com Software group on how best to include the software
portals worldwide in the same framework.

Yet another approach is to multi-publish output pages from one XML source regardless of
predetermined country groupings. For example, the legal statements for many IBM countries are the
same, regardless of the region or size of market. XML pages for wireless.ibm.com are generated
using this approach.

7 Acknowledgements
The authors wish to thank Dikran Meliksetian, Rosa Bolger, Lisa Intravio, Chris Wang, and
Marie Shafi, who helped put the methods described in this paper into practice, and who have
consistently supported and contributed to their further development.

8 References
[1] ISO web site, http://www.iso.ch/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html

[2] Nianjun Zhou, Dikran Meliksetian, Louis Weitzman, Sara Elo Dean, Jeff Milton, Peter Davis,
and Jessica Wu, "XML Content Management: Challenges and Solutions," XML Europe 2001, May 2001.
