How the SiteMinder Web Agent encodes and decodes URLs

Document ID : KB000011150
Last Modified Date : 14/02/2018
Introduction:

What algorithm does the Web Agent use to encode and decode URLs?

Question:

What algorithm does the Web Agent use to encode and decode URLs?

Environment:
All WebAgent Versions
Answer:

How the Web Agent encodes a URL:

=================

The Web Agent encodes the URL if it contains any of the following characters:

' ' (space), '&', '+', '?', '%', or '$'.
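
The trigger check described above can be sketched in a few lines of Python. This is only an illustration: the names TRIGGER_CHARS and needs_sm_encoding are hypothetical and do not come from the Web Agent itself.

```python
# Characters that, per the list above, cause the Web Agent to encode the URL.
# TRIGGER_CHARS and needs_sm_encoding are illustrative names, not agent APIs.
TRIGGER_CHARS = frozenset(" &+?%$")


def needs_sm_encoding(url: str) -> bool:
    """Return True if the URL contains at least one trigger character."""
    return any(ch in TRIGGER_CHARS for ch in url)
```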

First, the URL is prepended with '$SM$'. 

Next, the following rules are applied in order:

' ' is replaced with '%20'

'&' is replaced with '%26'

'+' is replaced with '%2b'

'?' is replaced with '%3f'

'@' is replaced with '%40'

'"' is left unchanged (no encoding)

'=' is replaced with '%3d'

'%' is replaced with '$%' or '-%', depending on the delimiter in use

 

Case of '$' and '-' 

> '$' is used as delimiter for traditional agents / LegacyEncoding = YES

'$' is replaced with '$$'

'$$' is replaced with '$$$$' 

'$$$' is replaced with '$$$$$$'

'-' is left unchanged (no encoding)

When decoding, the agent reverses this logic: it removes one '$' for each '$' it added during encoding.

 

> '-' is used as delimiter for framework agents / LegacyEncoding = NO

'-' is replaced with '--'

'--' is replaced with '----'

'---' is replaced with '------'

'$' is replaced with '%24'

When decoding, the agent reverses this logic: it removes one hyphen for each hyphen it added during encoding.
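
The replacement rules listed above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the agent's actual code: the names PERCENT_RULES and sm_escape are hypothetical, the prefix is not added, and only the characters named in the rule list are handled (the worked examples in this article also show ':', '/' and '.' percent-encoded, which this sketch does not attempt). The sketch walks the URL once, character by character, so that a '%' produced by an earlier replacement is never escaped a second time.

```python
# Percent replacements taken from the rule list above.
# PERCENT_RULES and sm_escape are illustrative names, not agent APIs.
PERCENT_RULES = {
    " ": "%20",
    "&": "%26",
    "+": "%2b",
    "?": "%3f",
    "@": "%40",
    "=": "%3d",
}


def sm_escape(url: str, legacy_encoding: bool) -> str:
    """Apply the listed replacement rules in a single pass (no prefix added)."""
    # '$' is the delimiter for LegacyEncoding = YES, '-' for LegacyEncoding = NO.
    delimiter = "$" if legacy_encoding else "-"
    out = []
    for ch in url:
        if ch == "%":
            out.append(delimiter + "%")   # literal '%' is guarded by the delimiter
        elif ch == delimiter:
            out.append(delimiter * 2)     # each delimiter character is doubled
        elif ch == "$":
            out.append("%24")             # only reached when the delimiter is '-'
        elif ch in PERCENT_RULES:
            out.append(PERCENT_RULES[ch])
        else:
            out.append(ch)                # '"' and everything else pass through
    return "".join(out)
```

For example, sm_escape('10-11--12', legacy_encoding=False) doubles every hyphen and returns '10--11----12', while the same input with legacy_encoding=True is returned unchanged.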

 

Scenario with traditional agents / LegacyEncoding = YES

The URL being encoded is:

http://server.domain.com/resource?P1=A+B&P2=Space%20Here

SM-Encoded, it becomes:

$SM$http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here

How the Web Agent decodes a URL:

=================

If the URL starts with '$SM$', the agent scans the string from the beginning. If the current character is '$', it skips to the next character and returns that character as-is. If the current character is '%', it reads the next two characters and returns the URL-decoded value. Otherwise, it returns the current character. The algorithm will *not* URL-decode a sequence such as $%20, because the % has already been consumed by the first case.
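
The scan described above can be sketched in Python as follows. This is a minimal illustration, not the agent's actual code: the function name sm_decode and its prefix and delimiter parameters are hypothetical, the handling of malformed input (a trailing delimiter, or a '%' not followed by two hex digits) is an assumption, and each %XX sequence is treated as a single ASCII character.

```python
import string

# sm_decode and its parameters are illustrative names, not agent APIs.
def sm_decode(encoded: str, prefix: str = "$SM$", delimiter: str = "$") -> str:
    """Decode an SM-encoded URL by scanning it left to right."""
    if not encoded.startswith(prefix):
        return encoded                      # not SM-encoded: leave untouched
    body = encoded[len(prefix):]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        hex_pair = body[i + 1:i + 3]
        if ch == delimiter and i + 1 < len(body):
            out.append(body[i + 1])         # emit the guarded character as-is
            i += 2
        elif (ch == "%" and len(hex_pair) == 2
              and all(c in string.hexdigits for c in hex_pair)):
            out.append(chr(int(hex_pair, 16)))  # URL-decode %XX
            i += 3
        else:
            out.append(ch)                  # ordinary (or malformed) character
            i += 1
    return "".join(out)
```

Because the delimiter branch is tested first, a sequence such as $%20 yields the literal text %20 rather than a space. Passing delimiter="-" applies the same scan to the hyphen-delimited form used when LegacyEncoding = NO.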

So, if the URL being decoded is:

$SM$http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here

first strip off the $SM$:

http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here

then scan the string until we find a '$' or a '%':

 

http%3a%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here

At this point, we see a %. So, we urldecode the % and the next two characters and then continue: 

http:%2f%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here

 

Again, we see a %. Repeat:

http:/%2fserver%2edomain%2ecom%2fresource%3fP1=A%2bB%26P2=Space$%20Here

Repeat for each remaining % sequence:

http://server.domain.com/resource?P1=A+B&P2=Space

Now we see a '$' character, which means we return the *next* character as-is and continue scanning.

 

http://server.domain.com/resource?P1=A+B&P2=Space%20Here 

And now we've reached the end of the string. This is the SM-Decoded value.

 

Scenario with Framework agents / LegacyEncoding = NO 

http://server.domain.com/protected/HeaderDumper.asp?1%202&3+4?5%6$7@8"9=10-11--12---13

becomes 

SM-HTTP%3a%2f%2fserver%2edomain%2ecom%2fprotected%2fHeaderDumper%2easp%3f1-%202%263%2b4%3f5-%6%247%408"9%3d10--11----12------13

 

Are you running a Traditional Web Agent or a Framework Agent?

Traditional Web Agents are installed on the following web servers:

IIS 5.0  

Apache 1.x  

Apache 1.x-based servers: IBM HTTP Server, Covalent Fast Start 2.x, and Oracle 9.x server 

Domino

 

Framework Agents are installed on the following web servers: 

IIS 6.0 

Apache 2.0  

Apache 2.0-based servers: IBM HTTP Server, Covalent ERS 2.x, HP Apache, and Oracle 10.x HTTP server  

Sun Java Systems 6.0 and 6.1

Note: The Sun Java System Web server was formerly called the Sun ONE Web server or the iPlanet Web server.