Strings

If you are used to Perl or similar string-processing oriented languages, you will have noticed by now that XSLT has only basic string-processing capabilities built in. XSLT is optimized for working with markup, not with strings.

However, XML data is in string format, so string processing is often required to manipulate the data during a transformation. Thankfully, we have two options: use a Java wrapper around XSLT to do the heavy-duty string processing outside the stylesheet, or implement the string processing directly in XSLT. In this class we will look at the second approach. Central to implementing this functionality in XSLT is the concept of recursive algorithms.
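
To make the idea of recursion concrete, here is a minimal sketch of a recursive string template. The template name replace-all and its parameters are hypothetical, chosen only for this illustration; they are not part of XSLT or of any standard library. The template emits the text before the first match, then the replacement, and then calls itself on the remainder of the string.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<!-- Hypothetical helper: replace every occurrence of $from in $text with $to -->
<xsl:template name="replace-all">
  <xsl:param name="text"/>
  <xsl:param name="from"/>
  <xsl:param name="to"/>
  <xsl:choose>
    <!-- Recursive case: output the part before the match and the
         replacement, then recurse on whatever follows the match.
         The empty-string guard prevents endless recursion. -->
    <xsl:when test="$from != '' and contains($text, $from)">
      <xsl:value-of select="substring-before($text, $from)"/>
      <xsl:value-of select="$to"/>
      <xsl:call-template name="replace-all">
        <xsl:with-param name="text" select="substring-after($text, $from)"/>
        <xsl:with-param name="from" select="$from"/>
        <xsl:with-param name="to" select="$to"/>
      </xsl:call-template>
    </xsl:when>
    <!-- Base case: no match left, output the rest unchanged -->
    <xsl:otherwise>
      <xsl:value-of select="$text"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- Sample call: turns a-b-c into a+b+c -->
<xsl:template match="/">
  <xsl:call-template name="replace-all">
    <xsl:with-param name="text" select="'a-b-c'"/>
    <xsl:with-param name="from" select="'-'"/>
    <xsl:with-param name="to" select="'+'"/>
  </xsl:call-template>
</xsl:template>

</xsl:stylesheet>

Every recursive template needs a base case that stops the recursion. Here the base case is the xsl:otherwise branch, which is reached as soon as the remaining text no longer contains $from.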

Note that there is an Internet-based community initiative to provide a standard library of extensions to XSLT.

ends-with() function

XSLT has a starts-with() function, but no ends-with(). Here is how to test whether $substr occurs at the end of $value:

  substring( $value, ( string-length( $value ) - string-length( $substr ) ) + 1 ) = $substr

Things to note:

  1. XSLT string indexes start at one (1), not zero (0)
  2. this expression evaluates to either true or false; it does not return the substring itself
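
As a rough sketch of how the test might be used, the stylesheet below places the expression in an xsl:when. The values 'report.xml' and '.xml' are made-up sample data. With these values the start position works out to (10 - 4) + 1 = 7, so the substring taken is '.xml'.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<!-- Hypothetical sample values, chosen only for illustration -->
<xsl:variable name="value" select="'report.xml'"/>
<xsl:variable name="substr" select="'.xml'"/>

<xsl:template match="/">
  <xsl:choose>
    <!-- The ends-with test from above, used as a condition -->
    <xsl:when test="substring( $value, ( string-length( $value ) - string-length( $substr ) ) + 1 ) = $substr">
      <xsl:text>match</xsl:text>
    </xsl:when>
    <xsl:otherwise>
      <xsl:text>no match</xsl:text>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>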

Strip whitespace from a string

A common task is to remove all whitespace characters from a string. Here is one way to do that:

  translate( $input, " &#x9;&#xa;&#xd;", "" )

Discussion Points

  1. How could we generalize this function?
  2. What characters do &#x9; and &#xa; and &#xd; match?
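
One possible direction for the first discussion point, offered only as a sketch: wrap the translate() call in a named template that takes the characters to remove as a parameter. The name strip-chars and the parameter names are hypothetical. When no $chars value is passed, the template falls back to the four whitespace characters.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<!-- Hypothetical helper: remove every character listed in $chars from $input -->
<xsl:template name="strip-chars">
  <xsl:param name="input"/>
  <!-- Default: space, tab, line feed, carriage return -->
  <xsl:param name="chars" select="' &#x9;&#xa;&#xd;'"/>
  <xsl:value-of select="translate($input, $chars, '')"/>
</xsl:template>

<!-- Sample call: strips spaces and hyphens, giving 5551234567 -->
<xsl:template match="/">
  <xsl:call-template name="strip-chars">
    <xsl:with-param name="input" select="'555 123-4567'"/>
    <xsl:with-param name="chars" select="' -'"/>
  </xsl:call-template>
</xsl:template>

</xsl:stylesheet>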

Numbers & Math

Basic XSLT provides only basic number and math functionality, but recursion rides to the rescue again. Basic arithmetic, counting, summing, and number formatting are all covered by XSLT. Everything else needs to be created. If you need to do fancy or complicated number crunching, then bring the data into something like a Java program and do it there.
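
As an illustration of building a missing math function with recursion, here is a minimal sketch of integer exponentiation. The template name power and its parameters are hypothetical, and the sketch assumes that $exponent is a non-negative integer. The running product is carried along in the $result parameter.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<!-- Hypothetical helper: $base raised to $exponent
     (assumes $exponent is a non-negative integer) -->
<xsl:template name="power">
  <xsl:param name="base"/>
  <xsl:param name="exponent"/>
  <xsl:param name="result" select="1"/>
  <xsl:choose>
    <!-- Recursive case: multiply once more and count the exponent down -->
    <xsl:when test="$exponent &gt; 0">
      <xsl:call-template name="power">
        <xsl:with-param name="base" select="$base"/>
        <xsl:with-param name="exponent" select="$exponent - 1"/>
        <xsl:with-param name="result" select="$result * $base"/>
      </xsl:call-template>
    </xsl:when>
    <!-- Base case: exponent used up, output the accumulated product -->
    <xsl:otherwise>
      <xsl:value-of select="$result"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- Sample call: 2 to the power of 10, giving 1024 -->
<xsl:template match="/">
  <xsl:call-template name="power">
    <xsl:with-param name="base" select="2"/>
    <xsl:with-param name="exponent" select="10"/>
  </xsl:call-template>
</xsl:template>

</xsl:stylesheet>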

Note the page called Gallery of Stupid XSL and XSLT Tricks. A lot of solutions, including some fairly complicated math, can be found there for your use.

Examples of Common Math Functions

Sometimes we need to go beyond grade-school math, with functions like these:

Absolute Value

<xsl:template name="math:abs">
  <xsl:param name="x"/>
  <xsl:choose>
    <xsl:when test="$x &lt; 0">
      <xsl:value-of select="$x * -1"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$x"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
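
The template above is only a fragment: before it can be called, the math prefix has to be bound to a namespace on the xsl:stylesheet element. The sketch below shows one way the call might look. It reuses the namespace URI from the median and mode listings, and the input value -7 is made-up sample data.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:math="http://www.ora.com/XSLTCookbook/math">

<xsl:output method="text"/>

<!-- Same template as above, repeated so this stylesheet runs on its own -->
<xsl:template name="math:abs">
  <xsl:param name="x"/>
  <xsl:choose>
    <xsl:when test="$x &lt; 0">
      <xsl:value-of select="$x * -1"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$x"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- Sample call: the absolute value of -7, giving 7 -->
<xsl:template match="/">
  <xsl:call-template name="math:abs">
    <xsl:with-param name="x" select="-7"/>
  </xsl:call-template>
</xsl:template>

</xsl:stylesheet>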

Median

Statisticians use three different kinds of statistics which we might think of as averages: mean, median, and mode. Let us look at an example solution for median first.

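Sort a set of numbers; the median is the value that falls in the middle of the sorted set. When the set has an even number of members, the median is the average of the two middle values. For example, the median of 1, 4, 7, 9 is (4 + 7) / 2 = 5.5. Note that the median must be a number.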

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:math="http://www.ora.com/XSLTCookbook/math">



<xsl:template name="math:median1">
  <xsl:param name="nodes" select="/.."/>
  <xsl:variable name="count" select="count($nodes)"/>
  <xsl:variable name="middle1" select="floor(($count + 1) div 2)"/>
  <xsl:variable name="middle2" select="ceiling(($count + 1) div 2)"/>

  <xsl:variable name="m1">
    <xsl:for-each select="$nodes">
      <xsl:sort data-type="number"/>
      <xsl:if test="position() = $middle1">
        <xsl:value-of select="."/>
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>

  <xsl:variable name="m2">
    <xsl:choose>
      <xsl:when test="$middle1 = $middle2">
        <xsl:value-of select="$m1"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each select="$nodes">
          <xsl:sort data-type="number"/>
          <xsl:if test="position() = $middle2">
            <xsl:value-of select="."/>
          </xsl:if>
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  
  <!-- The median -->
  <xsl:value-of select="($m1 + $m2) div 2"/>
 </xsl:template>

<xsl:template name="math:median2">
  <xsl:param name="nodes" select="/.."/>
  <xsl:variable name="count" select="count($nodes)"/>
  <xsl:variable name="middle" select="ceiling($count div 2)"/>
  <xsl:variable name="even" select="not($count mod 2)"/>


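  <!-- Caution: following-sibling::*[1] follows document order, not the
       sorted order, so this shortcut is only correct when $nodes are
       sibling elements that already appear in ascending order in the
       source document. -->
  <xsl:variable name="m1">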
    <xsl:for-each select="$nodes">
      <xsl:sort data-type="number"/>
      <xsl:if test="position() = $middle">
        <xsl:value-of select=". + ($even * ./following-sibling::*[1])"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>

  <!-- The median -->
  <xsl:value-of select="$m1 div ($even + 1)"/>
 </xsl:template>


<xsl:template match="/">

  <xsl:text>&#xa;</xsl:text>
  <xsl:call-template name="math:median1">
    <xsl:with-param name="nodes" select="*/*"/>
  </xsl:call-template>
  <xsl:text>&#xa;</xsl:text>
  <xsl:call-template name="math:median2">
    <xsl:with-param name="nodes" select="*/*"/>
  </xsl:call-template>
  
</xsl:template>

</xsl:stylesheet>

Mode

The element in a set that appears the most often is called the mode, and need not be numerical (the set can be of any type).

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:math="http://www.ora.com/XSLTCookbook/math" xmlns:test="test">

<xsl:output method="text"/>

<xsl:template name="math:mode">
  <xsl:param name="nodes" select="/.."/>
  <xsl:param name="max" select="0"/>
  <xsl:param name="mode" select="/.."/>

  <xsl:choose>
    <xsl:when test="not($nodes)">
      <xsl:copy-of select="$mode"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:variable name="first" select="$nodes[1]"/>
      <xsl:variable name="try" select="$nodes[. = $first]"/>
      <xsl:variable name="count" select="count($try)"/>
      <!-- Recurse with nodes not equal to first -->
      <xsl:call-template name="math:mode">
        <xsl:with-param name="nodes" select="$nodes[not(. = $first)]"/>
        <!-- If we have found a node that is more frequent, then
             pass its count; otherwise pass the old max count -->
        <xsl:with-param name="max"
          select="($count > $max) * $count + not($count > $max) * $max"/>
        <!-- Compute the new mode as ... -->
        <xsl:with-param name="mode">
          <xsl:choose>
            <!-- the first element in try if we found a new max -->
            <xsl:when test="$count > $max">
              <xsl:copy-of select="$try[1]"/>
            </xsl:when>
            <!-- the old mode plus the first element in try if we
                 found an equivalent count to current max; $mode is a
                 result tree fragment at this point, so copy both
                 pieces instead of using the union operator -->
            <xsl:when test="$count = $max">
              <xsl:copy-of select="$mode"/>
              <xsl:copy-of select="$try[1]"/>
            </xsl:when>
            <!-- otherwise the old mode stays the same -->
            <xsl:otherwise>
              <xsl:copy-of select="$mode"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:with-param>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>  
</xsl:template>

<test:data>1</test:data>
<test:data>2</test:data>
<test:data>1</test:data>
<test:data>1</test:data>
<test:data>1</test:data>
<test:data>1</test:data>
<test:data>1</test:data>
<test:data>2</test:data>
<test:data>2</test:data>
<test:data>2</test:data>
<test:data>2</test:data>
<test:data>2</test:data>
<test:data>2</test:data>
<test:data>2</test:data>

<xsl:template match="/">

  <xsl:text>&#xa;</xsl:text>
  <xsl:call-template name="math:mode">
    <xsl:with-param name="nodes" select="*/*"/>
  </xsl:call-template>
  
  <xsl:text>&#xa;</xsl:text>
  <xsl:text>&#xa;</xsl:text>
  <xsl:text>&#xa;</xsl:text>
  <xsl:call-template name="math:mode">
    <xsl:with-param name="nodes" select="document('')/*/test:data"/>
  </xsl:call-template>
  
</xsl:template>


</xsl:stylesheet>

In-Class Exercise

Create an XSLT template with the name "math:mean" which returns ( the sum of the numbers in a node set ) divided by ( the count of the numbers in that node set ). If you get a solution to work, please show the instructor.

You can download a simple data file for testing purposes from the course website.



Last modified: 9 Mar 2009 11:02:47 AM