{"id":6401,"date":"2011-11-16T09:57:04","date_gmt":"2011-11-16T08:57:04","guid":{"rendered":"http:\/\/blog.trifork.nl\/?p=6401"},"modified":"2011-11-16T09:57:04","modified_gmt":"2011-11-16T08:57:04","slug":"apache-lucene-flexiblescoring-with-indexdocvalues","status":"publish","type":"post","link":"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/","title":{"rendered":"Apache Lucene FlexibleScoring with IndexDocValues"},"content":{"rendered":"<p>During <a href=\"http:\/\/code.google.com\/soc\/\">Google Summer of Code<\/a> 2011, David Nemeskey, a PhD student, <a href=\"http:\/\/wiki.apache.org\/lucene-java\/SummerOfCode2011ProjectRanking\">proposed<\/a> to improve <a href=\"http:\/\/lucene.apache.org\">Lucene\u2019s<\/a> scoring architecture and implement some state-of-the-art ranking models with the new framework. Prior to this, and in all Lucene versions released so far, the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Vector_space_model\">Vector-Space Model<\/a> was tightly bound into Lucene. If you found yourself in a situation where another scoring model worked better for your use case, you basically had two choices: either override all existing Scorers in Queries and implement your own model, provided you had all the statistics available, or switch to some other search engine providing alternative models or extension points.<\/p>\n<p>With Lucene 4.0 this is history! David Nemeskey and Robert Muir added an extensible API as well as index-based statistics, such as the sum of total term frequency and the sum of document frequency per field, to support multiple scoring models. 
Lucene 4.0 comes with:<\/p>\n<ul>\n<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Vector_space_model\">TF\/IDF Vector-Space Model<\/a><\/li>\n<li><a href=\"http:\/\/theses.gla.ac.uk\/1570\/\">Divergence from Randomness<\/a><\/li>\n<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Language_model\">Language Models<\/a><\/li>\n<li><a href=\"http:\/\/dl.acm.org\/citation.cfm?id=1835490\">Information Based Models<\/a><\/li>\n<li><a href=\"http:\/\/en.wikipedia.org\/wiki\/Okapi_BM25\">Okapi BM25<\/a><\/li>\n<\/ul>\n<p>Lucene&#8217;s central scoring class <a href=\"https:\/\/builds.apache.org\/job\/Lucene-trunk\/javadoc\/core\/org\/apache\/lucene\/search\/similarities\/Similarity.html\">Similarity<\/a> has been extended to return dedicated Scorers like <a href=\"https:\/\/builds.apache.org\/job\/Lucene-trunk\/javadoc\/core\/org\/apache\/lucene\/search\/similarities\/Similarity.ExactDocScorer.html\">ExactDocScorer<\/a> and <a href=\"https:\/\/builds.apache.org\/job\/Lucene-trunk\/javadoc\/core\/org\/apache\/lucene\/search\/similarities\/Similarity.SloppyDocScorer.html\">SloppyDocScorer<\/a> to calculate the actual score. This refactoring moved the actual score calculation out of the query Scorers into the Similarity, which allows implementing alternative scoring within a single method. Lucene 4.0 also comes with a new SimilarityProvider, which lets you define a Similarity per field. Each field could use a slightly different similarity or incorporate additional scoring factors like <a href=\"http:\/\/blog.trifork.nl\/2011\/10\/27\/introducing-lucene-index-doc-values\/\">IndexDocValues<\/a>.<\/p>\n<h3>Boosting Similarity with IndexDocValues<\/h3>\n<p>Now that we have a selection of scoring models and the freedom to extend them, we can tailor the scoring function exactly to our needs. Let&#8217;s look at a specific use case &#8211; custom boosting. 
Imagine you indexed websites and calculated a PageRank for each of them, but Lucene&#8217;s index-time boosting mechanism is not flexible enough for you. In that case you can use IndexDocValues to store the PageRank. First of all you need to get your data into Lucene, i.e. store the PageRank in an IndexDocValues field; <a href=\"#f1\">Figure 1<\/a> shows an example.<\/p>\n<p><a name=\"f1\"><\/a><\/p>\n<pre>IndexWriter writer = ...;\nfloat pageRank = ...;\nDocument doc = new Document();\n<span style=\"color: #3f7f59\">\/\/ add a standalone IndexDocValues field<\/span>\nIndexDocValuesField valuesField = new IndexDocValuesField(\"pageRank\");\nvaluesField.setFloat(pageRank);\ndoc.add(valuesField);\ndoc.add(...); <span style=\"color: #3f7f59\">\/\/ add your title etc.<\/span>\nwriter.addDocument(doc);\nwriter.commit();<\/pre>\n<div class=\"portlet-msg-info\">Figure 1. Adding custom boost \/ score values as IndexDocValues<\/div>\n<p>Once we have indexed our documents we can proceed to implement our custom Similarity to incorporate the PageRank into the document score. However, most of us won&#8217;t be in a position where we can or want to come up with an entirely new scoring model, so we are likely to use one of the scoring models already available in Lucene. But even if we are not entirely sure which one we will eventually use, we can already implement the PageRankSimilarity. 
(See <a href=\"#f2\">Figure 2<\/a>.)<\/p>\n<p><a name=\"f2\"><\/a><\/p>\n<pre><span style=\"color: #7f0055;font-weight: bold\">public<\/span> <span style=\"color: #7f0055;font-weight: bold\">class<\/span> PageRankSimilarity <span style=\"color: #7f0055;font-weight: bold\">extends<\/span> Similarity {\n  <span style=\"color: #7f0055;font-weight: bold\">private<\/span> <span style=\"color: #7f0055;font-weight: bold\">final<\/span> Similarity sim;\n\n  <span style=\"color: #7f0055;font-weight: bold\">public<\/span> PageRankSimilarity(Similarity sim) {\n    <span style=\"color: #7f0055;font-weight: bold\">this<\/span>.sim = sim; <span style=\"color: #3f7f59\">\/\/ wrap another similarity<\/span>\n  }\n\n  @Override\n  <span style=\"color: #7f0055;font-weight: bold\">public<\/span> ExactDocScorer exactDocScorer(Stats stats, <span style=\"color: #7f0055;font-weight: bold\">String<\/span> fieldName,\n      AtomicReaderContext context) <span style=\"color: #7f0055;font-weight: bold\">throws<\/span> <span style=\"color: #7f0055;font-weight: bold\">IOException<\/span> {\n    <span style=\"color: #7f0055;font-weight: bold\">final<\/span> ExactDocScorer sub = sim.exactDocScorer(stats, fieldName, context);\n    <span style=\"color: #3f7f59\">\/\/ simply pull an IndexDocValues Source for the pageRank field<\/span>\n    <span style=\"color: #7f0055;font-weight: bold\">final<\/span> Source values = context.reader.docValues(<span style=\"color: #2a00ff\">\"pageRank\"<\/span>).getSource();\n\n    <span style=\"color: #7f0055;font-weight: bold\">return<\/span> <span style=\"color: #7f0055;font-weight: bold\">new<\/span> ExactDocScorer() {\n      @Override\n      <span style=\"color: #7f0055;font-weight: bold\">public<\/span> <span style=\"color: #7f0055;font-weight: bold\">float<\/span> score(<span style=\"color: #7f0055;font-weight: bold\">int<\/span> doc, <span style=\"color: 
#7f0055;font-weight: bold\">int<\/span> freq) {\n        <span style=\"color: #3f7f59\">\/\/ multiply the PageRank into your score<\/span>\n        <span style=\"color: #7f0055;font-weight: bold\">return<\/span> (<span style=\"color: #7f0055;font-weight: bold\">float<\/span>) values.getFloat(doc) * sub.score(doc, freq);\n      }\n      @Override\n      <span style=\"color: #7f0055;font-weight: bold\">public<\/span> Explanation explain(<span style=\"color: #7f0055;font-weight: bold\">int<\/span> doc, Explanation freq) {\n        <span style=\"color: #3f7f59\">\/\/ placeholder so the class compiles: delegates to the wrapped scorer;<\/span>\n        <span style=\"color: #3f7f59\">\/\/ a complete explain would also report the PageRank factor<\/span>\n        <span style=\"color: #7f0055;font-weight: bold\">return<\/span> sub.explain(doc, freq);\n      }\n    };\n  }\n  @Override\n  <span style=\"color: #7f0055;font-weight: bold\">public<\/span> <span style=\"color: #7f0055;font-weight: bold\">byte<\/span> computeNorm(FieldInvertState state) {\n    <span style=\"color: #7f0055;font-weight: bold\">return<\/span> sim.computeNorm(state);\n  }\n\n  @Override\n  <span style=\"color: #7f0055;font-weight: bold\">public<\/span> Stats computeStats(CollectionStatistics collectionStats,\n                <span style=\"color: #7f0055;font-weight: bold\">float<\/span> queryBoost, TermStatistics... termStats) {\n    <span style=\"color: #7f0055;font-weight: bold\">return<\/span> sim.computeStats(collectionStats, queryBoost, termStats);\n  }\n}<\/pre>\n<div class=\"portlet-msg-info\">Figure 2. Custom Similarity delegate using IndexDocValues<\/div>\n<p>With most calls delegated to some other Similarity of your choice, boosting documents by PageRank is as simple as it gets. All you need to do is pull a Source from the IndexReader passed in via AtomicReaderContext (atomic in this context means that it is a leaf reader in the Lucene IndexReader hierarchy, also referred to as a SegmentReader). 
The <a href=\"https:\/\/builds.apache.org\/job\/Lucene-trunk\/javadoc\/core\/org\/apache\/lucene\/index\/values\/IndexDocValues.html#getSource%28%29\">IndexDocValues#getSource()<\/a> method will load the values for this field atomically on the first request and buffer them in memory until the reader goes out of scope (or until you manually unload them; I might cover that in a different post). Make sure you don&#8217;t use <a href=\"https:\/\/builds.apache.org\/job\/Lucene-trunk\/javadoc\/core\/org\/apache\/lucene\/index\/values\/IndexDocValues.html#load%28%29\">IndexDocValues#load()<\/a>, which loads the values again on every invocation.<\/p>\n<h3>Can I use this in Apache Solr?<\/h3>\n<p><a href=\"http:\/\/lucene.apache.org\/solr\">Apache Solr<\/a> already lets you define custom similarities in its <em>schema.xml<\/em> file. Inside the &lt;types&gt; section you can define a custom similarity per &lt;fieldType&gt;, as shown in <a href=\"#f3\">Figure 3<\/a> below.<\/p>\n<p><span style=\"color: #7f0055\"><a name=\"f3\"><\/a><\/span><\/p>\n<pre><span style=\"color: #7f0055\">&lt;<\/span><span style=\"color: #7f0055\">fieldType<\/span> name=<span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #2a00ff\">text<\/span><span style=\"color: #2a00ff\">\"<\/span> class=<span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #2a00ff\">solr.TextField<\/span><span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #7f0055\">&gt;<\/span>\n  <span style=\"color: #7f0055\">&lt;<\/span><span style=\"color: #7f0055\">analyzer<\/span> class=<span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #2a00ff\">org.apache.lucene.analysis.standard.StandardAnalyzer<\/span><span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #7f0055\">\/&gt;<\/span>\n  <span style=\"color: #7f0055\">&lt;<\/span><span style=\"color: #7f0055\">similarity<\/span> class=<span style=\"color: #2a00ff\">\"<\/span><span style=\"color: 
#2a00ff\">solr.BM25SimilarityFactory<\/span><span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #7f0055\">&gt;<\/span>\n    <span style=\"color: #7f0055\">&lt;<\/span><span style=\"color: #7f0055\">float<\/span> name=<span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #2a00ff\">k1<\/span><span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #7f0055\">&gt;<\/span>1.2<span style=\"color: #7f0055\">&lt;\/<\/span><span style=\"color: #7f0055\">float<\/span><span style=\"color: #7f0055\">&gt;<\/span>\n    <span style=\"color: #7f0055\">&lt;<\/span><span style=\"color: #7f0055\">float<\/span> name=<span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #2a00ff\">b<\/span><span style=\"color: #2a00ff\">\"<\/span><span style=\"color: #7f0055\">&gt;<\/span>0.76<span style=\"color: #7f0055\">&lt;\/<\/span><span style=\"color: #7f0055\">float<\/span><span style=\"color: #7f0055\">&gt;<\/span>\n  <span style=\"color: #7f0055\">&lt;\/<\/span><span style=\"color: #7f0055\">similarity<\/span><span style=\"color: #7f0055\">&gt;<\/span>\n<span style=\"color: #7f0055\">&lt;\/<\/span><span style=\"color: #7f0055\">fieldType<\/span><span style=\"color: #7f0055\">&gt;<\/span><\/pre>\n<div class=\"portlet-msg-info\">Figure 3. Using BM25 Scoring Model in Solr<\/div>\n<p>Unfortunately, IndexDocValues are not yet exposed in Solr. There is an open <a href=\"https:\/\/issues.apache.org\/jira\/browse\/SOLR-2753\">issue<\/a> that aims to add support for them, but it has not seen any progress yet. If you feel you could benefit from IndexDocValues and all their features, and you want to get involved in Apache Lucene &amp; Solr, feel free to comment on the issue. 
I&#8217;d be delighted to help you work towards IndexDocValues support in Solr!<\/p>\n<h3>What is next?<\/h3>\n<p>I haven&#8217;t decided what comes next in this series of posts, but it&#8217;s likely to be yet another use case for IndexDocValues, like grouping and sorting, or a closer look at how IndexDocValues are integrated into Lucene&#8217;s Flexible Indexing.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>During\u00a0GoogleSummerOfCode 2011\u00a0David Nemeskey, PhD student, proposed to improve Lucene\u2019s scoring architecture and implement some state-of-the-art ranking models with the new framework. Prior to this and in all Lucene versions released so far the Vector-Space Model was tightly bound into Lucene. If you found yourself in a situation where another scoring model worked better for your [&hellip;]<\/p>\n","protected":false},"author":107,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[15,65],"tags":[35,33,16,11],"class_list":["post-6401","post","type-post","status-publish","format-standard","hentry","category-enterprise-search","category-big_data_search","tag-lucene","tag-solr","tag-enterprise-search","tag-java"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Apache Lucene FlexibleScoring with IndexDocValues - Trifork Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Lucene FlexibleScoring with IndexDocValues - Trifork Blog\" \/>\n<meta property=\"og:description\" 
content=\"During\u00a0GoogleSummerOfCode 2011\u00a0David Nemeskey, PhD student, proposed to improve Lucene\u2019s scoring architecture and implement some state-of-the-art ranking models with the new framework. Prior to this and in all Lucene versions released so far the Vector-Space Model was tightly bound into Lucene. If you found yourself in a situation where another scoring model worked better for your [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/\" \/>\n<meta property=\"og:site_name\" content=\"Trifork Blog\" \/>\n<meta property=\"article:published_time\" content=\"2011-11-16T08:57:04+00:00\" \/>\n<meta name=\"author\" content=\"Simon Willnauer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Simon Willnauer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/\",\"url\":\"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/\",\"name\":\"Apache Lucene FlexibleScoring with IndexDocValues - Trifork 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/trifork.nl\/blog\/#website\"},\"datePublished\":\"2011-11-16T08:57:04+00:00\",\"author\":{\"@id\":\"https:\/\/trifork.nl\/blog\/#\/schema\/person\/88be6f0de12503d08f3d5f18796e4051\"},\"breadcrumb\":{\"@id\":\"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/trifork.nl\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Apache Lucene FlexibleScoring with IndexDocValues\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/trifork.nl\/blog\/#website\",\"url\":\"https:\/\/trifork.nl\/blog\/\",\"name\":\"Trifork Blog\",\"description\":\"Keep updated on the technical solutions Trifork is working on!\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/trifork.nl\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/trifork.nl\/blog\/#\/schema\/person\/88be6f0de12503d08f3d5f18796e4051\",\"name\":\"Simon Willnauer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/trifork.nl\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/254a556e9dde04a2d02ed76e5971a0fd?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/254a556e9dde04a2d02ed76e5971a0fd?s=96&d=mm&r=g\",\"caption\":\"Simon Willnauer\"},\"description\":\"I am a Apache Lucene PMC and core committer and work mainly on 
scalable distributed information retrieval systems as well as the Lucene core engine. I'm also a co-organizer of BerlinBuzzwords (http:\/\/www.berlinbuzzwords.de) an annual conference on Scalability Berlin.\",\"sameAs\":[\"http:\/\/www.jteam.nl\"],\"url\":\"https:\/\/trifork.nl\/blog\/author\/simonw\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Apache Lucene FlexibleScoring with IndexDocValues - Trifork Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/","og_locale":"en_US","og_type":"article","og_title":"Apache Lucene FlexibleScoring with IndexDocValues - Trifork Blog","og_description":"During\u00a0GoogleSummerOfCode 2011\u00a0David Nemeskey, PhD student, proposed to improve Lucene\u2019s scoring architecture and implement some state-of-the-art ranking models with the new framework. Prior to this and in all Lucene versions released so far the Vector-Space Model was tightly bound into Lucene. If you found yourself in a situation where another scoring model worked better for your [&hellip;]","og_url":"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/","og_site_name":"Trifork Blog","article_published_time":"2011-11-16T08:57:04+00:00","author":"Simon Willnauer","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Simon Willnauer","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/","url":"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/","name":"Apache Lucene FlexibleScoring with IndexDocValues - Trifork Blog","isPartOf":{"@id":"https:\/\/trifork.nl\/blog\/#website"},"datePublished":"2011-11-16T08:57:04+00:00","author":{"@id":"https:\/\/trifork.nl\/blog\/#\/schema\/person\/88be6f0de12503d08f3d5f18796e4051"},"breadcrumb":{"@id":"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/trifork.nl\/blog\/apache-lucene-flexiblescoring-with-indexdocvalues\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/trifork.nl\/blog\/"},{"@type":"ListItem","position":2,"name":"Apache Lucene FlexibleScoring with IndexDocValues"}]},{"@type":"WebSite","@id":"https:\/\/trifork.nl\/blog\/#website","url":"https:\/\/trifork.nl\/blog\/","name":"Trifork Blog","description":"Keep updated on the technical solutions Trifork is working on!","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/trifork.nl\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/trifork.nl\/blog\/#\/schema\/person\/88be6f0de12503d08f3d5f18796e4051","name":"Simon 
Willnauer","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/trifork.nl\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/254a556e9dde04a2d02ed76e5971a0fd?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/254a556e9dde04a2d02ed76e5971a0fd?s=96&d=mm&r=g","caption":"Simon Willnauer"},"description":"I am a Apache Lucene PMC and core committer and work mainly on scalable distributed information retrieval systems as well as the Lucene core engine. I'm also a co-organizer of BerlinBuzzwords (http:\/\/www.berlinbuzzwords.de) an annual conference on Scalability Berlin.","sameAs":["http:\/\/www.jteam.nl"],"url":"https:\/\/trifork.nl\/blog\/author\/simonw\/"}]}},"_links":{"self":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/posts\/6401","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/users\/107"}],"replies":[{"embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/comments?post=6401"}],"version-history":[{"count":0,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/posts\/6401\/revisions"}],"wp:attachment":[{"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/media?parent=6401"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/categories?post=6401"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/trifork.nl\/blog\/wp-json\/wp\/v2\/tags?post=6401"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}