{"id":2935,"date":"2008-09-08T19:04:22","date_gmt":"2008-09-08T18:04:22","guid":{"rendered":"http:\/\/jelmer.jteam.nl\/2008\/09\/08\/disabling-url-rewriting-for-the-googlebot\/"},"modified":"2008-09-08T19:04:22","modified_gmt":"2008-09-08T18:04:22","slug":"disabling-url-rewriting-for-the-googlebot","status":"publish","type":"post","link":"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/","title":{"rendered":"Disabling URL rewriting for the Googlebot"},"content":{"rendered":"<p>HTTP is a stateless protocol. To work around the problems this causes, web applications have the concept of a session. When a user requests a web page for the first time, the user is assigned a unique 32-character string. This string is sent along with subsequent requests to indicate that they originate from the same user. The most common way to pass along this string, or session identifier, is to send it in a cookie. But what if a user chooses to disable cookies? In that case a servlet container will fall back on URL rewriting: the session identifier is appended to the end of every link in your application. So a link to your homepage might look like this after rewriting:<\/p>\n<p>\/index.html;jsessionid=AA922A8B781AC4F95E68F88B0AF8CCB3<\/p>\n<p>When you click this link, the container parses the jsessionid value and determines that you are the same user who made the previous request. This way even privacy-conscious users can continue to use your website. This all just works as long as you use something like the JSTL url tag. When the user has disabled cookies, it automatically rewrites all the URLs in your application.<\/p>\n<p>Most of the time this is what you want. However, there is an unfortunate side effect to this strategy. The Googlebot, which constantly spiders the internet for new content, does not support cookies. This means that it will see, and index, the rewritten URLs. 
A <a href=\"http:\/\/www.google.nl\/search?hl=nl&amp;client=firefox-a&amp;rls=org.mozilla%3Aen-US%3Aofficial&amp;hs=99N&amp;q=inurl%3Ajsessionid&amp;btnG=Zoeken&amp;meta=\">quick search<\/a> suggests that this is a fairly common problem. The rewritten URLs will hurt your Google ranking because less of the URL will match a user's search query. So how do you solve it? It turned out to be fairly trivial.<\/p>\n<p>I created an HttpServletResponseWrapper subclass that overrides the encodeURL and encodeRedirectURL methods so that they do not append the session identifier. The wrapper is installed by a servlet filter that applies it only when it determines that the request originates from the Googlebot. You can check this fairly easily by inspecting the User-Agent header sent along with every request. I have included the source below.<\/p>\n<p><a href=\"http:\/\/jelmer.jteam.nl\/wp-content\/upload\/SeoResponsWrapper.java.html\">SeoResponseWrapper.java<\/a><br \/>\n<a href=\"http:\/\/jelmer.jteam.nl\/wp-content\/upload\/SeoFilter.java.html\">SeoFilter.java<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Http is a stateless protocol. To work around the problems caused by this, web applications have the concept of a session. When a user requests a webpage for the first time the user is assigned a unique 32 character string. 
This string can be send along in subsequent requests to indicate that these requests are [&hellip;]<\/p>\n","protected":false},"author":58,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[31,124],"tags":[],"class_list":["post-2935","post","type-post","status-publish","format-standard","hentry","category-java","category-system-administration"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Disabling URL rewriting for the Googlebot - Trifork Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Disabling URL rewriting for the Googlebot - Trifork Blog\" \/>\n<meta property=\"og:description\" content=\"Http is a stateless protocol. To work around the problems caused by this, web applications have the concept of a session. When a user requests a webpage for the first time the user is assigned a unique 32 character string. 
This string can be send along in subsequent requests to indicate that these requests are [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/\" \/>\n<meta property=\"og:site_name\" content=\"Trifork Blog\" \/>\n<meta property=\"article:published_time\" content=\"2008-09-08T18:04:22+00:00\" \/>\n<meta name=\"author\" content=\"Jelmer Kuperus\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Jelmer Kuperus\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/\",\"url\":\"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/\",\"name\":\"Disabling URL rewriting for the Googlebot - Trifork Blog\",\"isPartOf\":{\"@id\":\"https:\/\/trifork.nl\/blog\/#website\"},\"datePublished\":\"2008-09-08T18:04:22+00:00\",\"author\":{\"@id\":\"https:\/\/trifork.nl\/blog\/#\/schema\/person\/c0ee9f25744015bf661fee1b797341f2\"},\"breadcrumb\":{\"@id\":\"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/trifork.nl\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Disabling URL rewriting for the 
Googlebot\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/trifork.nl\/blog\/#website\",\"url\":\"https:\/\/trifork.nl\/blog\/\",\"name\":\"Trifork Blog\",\"description\":\"Keep updated on the technical solutions Trifork is working on!\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/trifork.nl\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/trifork.nl\/blog\/#\/schema\/person\/c0ee9f25744015bf661fee1b797341f2\",\"name\":\"Jelmer Kuperus\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/trifork.nl\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/fff87cf8073c776ffcbe26326f713998?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/fff87cf8073c776ffcbe26326f713998?s=96&d=mm&r=g\",\"caption\":\"Jelmer Kuperus\"},\"sameAs\":[\"http:\/\/www.dutchworks.nl\"],\"url\":\"https:\/\/trifork.nl\/blog\/author\/jelmer\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Disabling URL rewriting for the Googlebot - Trifork Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/","og_locale":"en_US","og_type":"article","og_title":"Disabling URL rewriting for the Googlebot - Trifork Blog","og_description":"Http is a stateless protocol. To work around the problems caused by this, web applications have the concept of a session. When a user requests a webpage for the first time the user is assigned a unique 32 character string. 
This string can be send along in subsequent requests to indicate that these requests are [&hellip;]","og_url":"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/","og_site_name":"Trifork Blog","article_published_time":"2008-09-08T18:04:22+00:00","author":"Jelmer Kuperus","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Jelmer Kuperus","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/","url":"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/","name":"Disabling URL rewriting for the Googlebot - Trifork Blog","isPartOf":{"@id":"https:\/\/trifork.nl\/blog\/#website"},"datePublished":"2008-09-08T18:04:22+00:00","author":{"@id":"https:\/\/trifork.nl\/blog\/#\/schema\/person\/c0ee9f25744015bf661fee1b797341f2"},"breadcrumb":{"@id":"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/trifork.nl\/blog\/disabling-url-rewriting-for-the-googlebot\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/trifork.nl\/blog\/"},{"@type":"ListItem","position":2,"name":"Disabling URL rewriting for the Googlebot"}]},{"@type":"WebSite","@id":"https:\/\/trifork.nl\/blog\/#website","url":"https:\/\/trifork.nl\/blog\/","name":"Trifork Blog","description":"Keep updated on the technical solutions Trifork is working 
on!","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/trifork.nl\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/trifork.nl\/blog\/#\/schema\/person\/c0ee9f25744015bf661fee1b797341f2","name":"Jelmer Kuperus","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/trifork.nl\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/fff87cf8073c776ffcbe26326f713998?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/fff87cf8073c776ffcbe26326f713998?s=96&d=mm&r=g","caption":"Jelmer Kuperus"},"sameAs":["http:\/\/www.dutchworks.nl"],"url":"https:\/\/trifork.nl\/blog\/author\/jelmer\/"}]}},"_links":{"self":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/posts\/2935","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/users\/58"}],"replies":[{"embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/comments?post=2935"}],"version-history":[{"count":0,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/posts\/2935\/revisions"}],"wp:attachment":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/media?parent=2935"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/categories?post=2935"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/tags?post=2935"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}