<h2>Gimme all resources you have &#8211; I can use them!</h2>
<h3>Exploiting full IO and CPU concurrency when indexing with Apache Lucene</h3>
<p>Over the last year, Apache Lucene has improved tremendously, with outstanding contributions such as <a href="http://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html">100 times faster FuzzyQueries</a>, a new <a href="http://blog.mikemccandless.com/2010/12/using-finite-state-transducers-in.html">term-dictionary implementation</a>, enhanced <a href="http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html">segment merging</a> and the famous <a href="http://blog.mikemccandless.com/2010/10/fun-with-flexible-indexing.html">flexible indexing</a> API. Recently I started working on another fundamental change, referred to as <a href="https://issues.apache.org/jira/browse/LUCENE-2324">DocumentsWriterPerThread</a>: an extensive <tt>IndexWriter</tt> refactoring of the code that defines indexing performance, even though most Lucene users never touch it directly. Let me give you a brief introduction.</p>
<!--more-->
<figure id="attachment_3108" aria-describedby="caption-attachment-3108" style="width: 337px" class="wp-caption alignleft"><a href="http://blog.jteam.nl/wp-content/uploads/2011/04/trunk_indexing_small.png"><img class="size-full wp-image-3108" style="border: 1px solid black" title="Fig. 1 Document Indexing on Lucene 4.0 trunk" src="http://blog.jteam.nl/wp-content/uploads/2011/04/trunk_indexing_small.png" alt="Document Indexing on Lucene 4.0 trunk" width="337" height="350" /></a><figcaption id="caption-attachment-3108" class="wp-caption-text">Fig. 1 Document Indexing on Lucene 4.0 trunk</figcaption></figure>
<p>During indexing, Lucene builds an in-memory index before it flushes the index structures to persistent storage. Internally, the indexer builds up several smaller index segments in memory and merges them together once a flush is needed. Those smaller segments are built concurrently, provided the indexer is used by multiple threads in parallel. The <tt>IndexWriter</tt> uses a <tt>DocumentsWriter</tt> that selects a thread-private data structure for each incoming indexing thread, which in turn inverts the document into its in-memory segment (see Fig. 1).</p>
<p>This model allows full CPU concurrency until the <tt>IndexWriter</tt> must flush the in-memory segments to the directory (a low-level abstraction on top of Java&#8217;s file system API). Once Lucene starts flushing, all indexing threads have to stop and wait until the flushing thread has finished writing the segment to disk. This implementation in Lucene 3.0 is essentially a stop-the-world model that prevents any indexing thread from making progress during a flush.
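</p>
<p>To make the stop-the-world behaviour concrete, here is a toy model in plain Java. This is not Lucene code; all names are invented for illustration, and the single <tt>synchronized</tt> monitor stands in for the global coordination that indexing and flushing share in the old model:</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the pre-DWPT indexer: every thread appends to a private
// in-memory buffer, but flush() merges ALL buffers into one segment under
// the same lock that addDocument() takes, so indexing stalls during a flush.
class StopTheWorldWriter {
    private final Map<Long, StringBuilder> threadBuffers = new ConcurrentHashMap<>();
    private int flushedSegments = 0;

    // Indexing takes the global monitor, so it blocks while a flush runs.
    synchronized void addDocument(String doc) {
        threadBuffers
            .computeIfAbsent(Thread.currentThread().getId(), id -> new StringBuilder())
            .append(doc).append('\n');
    }

    // Merges every thread-private buffer into a single segment and clears them;
    // while this runs, no thread can enter addDocument().
    synchronized int flush() {
        StringBuilder segment = new StringBuilder();
        threadBuffers.values().forEach(segment::append);
        threadBuffers.clear();
        return ++flushedSegments; // pretend the merged segment went to disk
    }
}
```

<p>Because <tt>flush()</tt> holds the same monitor as <tt>addDocument()</tt>, every concurrent indexing call stalls for the whole duration of the flush, which is exactly the behaviour visible in the ingest-rate charts.</p>
<p>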
Especially on slow IO systems, or when indexing tremendous amounts of data, this limitation can become a serious bottleneck.</p>
<p>The <tt>DocumentsWriterPerThread</tt> (DWPT) refactoring, currently developed in a <a href="http://svn.apache.org/repos/asf/lucene/dev/branches/realtime_search/">branch</a>, tries to remove this limitation and exploit full CPU and IO concurrency during indexing. Instead of merging the in-memory data structures during a flush and writing a single segment, each <tt>DocumentsWriterPerThread</tt> writes its own private segment. This allows us to flush DWPTs concurrently without preventing concurrent indexing threads from making progress.</p>
<figure id="attachment_3107" aria-describedby="caption-attachment-3107" style="width: 350px" class="wp-caption alignleft"><a href="http://blog.jteam.nl/wp-content/uploads/2011/04/dwpt_indexing_small.png"><img class="size-full wp-image-3107" title="Fig. 2 Document Indexing with DocumentsWriterPerThread" src="http://blog.jteam.nl/wp-content/uploads/2011/04/dwpt_indexing_small.png" alt="Document Indexing with DocumentsWriterPerThread" width="350" height="295" /></a><figcaption id="caption-attachment-3107" class="wp-caption-text">Fig. 2 Document Indexing with DocumentsWriterPerThread</figcaption></figure>
<p>The idea for this refactoring has been around for a while: it was initially brought up by Michael Busch (Twitter) in a realtime-search context, and it has been developed in a branch since June 2010.
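</p>
<p>The per-thread model can be sketched with another toy in plain Java (invented names, not the branch&#8217;s actual code): each thread owns a private writer, and once that writer grows past a limit it is swapped for a fresh one and flushed by the thread that filled it, while every other thread keeps indexing undisturbed:</p>

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy DWPT: a thread-private in-memory segment, touched by one thread only.
class ToyDWPT {
    final StringBuilder buffer = new StringBuilder();
    void addDocument(String doc) { buffer.append(doc).append('\n'); }
    int bytesUsed() { return buffer.length(); }
}

// Toy per-thread writer: no global lock; a full DWPT is swapped out and
// flushed by the indexing thread that filled it, so the other threads
// never stop making progress.
class ToyPerThreadWriter {
    private final ThreadLocal<ToyDWPT> perThread = ThreadLocal.withInitial(ToyDWPT::new);
    private final AtomicInteger flushedSegments = new AtomicInteger();
    private final int flushLimitBytes;

    ToyPerThreadWriter(int flushLimitBytes) { this.flushLimitBytes = flushLimitBytes; }

    void addDocument(String doc) {
        ToyDWPT dwpt = perThread.get();
        dwpt.addDocument(doc);
        if (dwpt.bytesUsed() >= flushLimitBytes) {
            perThread.set(new ToyDWPT()); // swap in a fresh DWPT right away
            flush(dwpt);                  // this thread writes its own segment
        }
    }

    private void flush(ToyDWPT full) {
        flushedSegments.incrementAndGet(); // pretend we wrote one private segment
    }

    int flushedSegments() { return flushedSegments.get(); }
}
```

<p>The key design point is that a flush only ever touches a DWPT that has already been swapped out, so there is nothing left to coordinate between flushing and indexing threads.</p>
<p>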
I recently worked together with Mike McCandless on adding one of the <a href="https://issues.apache.org/jira/browse/LUCENE-2573">missing pieces</a> needed to start benchmarking and eventually merge with trunk. Earlier this week I committed a new <tt>FlushPolicy</tt> that controls when a <tt>DocumentsWriterPerThread</tt> starts flushing its segment to disk. The global <tt>DocumentsWriter</tt> consults the policy on every change and, once the policy marks a DWPT as flush-pending, tries to pull it out. If a DWPT must flush, the <tt>DocumentsWriter</tt> swaps the pending DWPT for a fresh one and starts flushing the segment.</p>
<p>For the initial version of Lucene&#8217;s default <tt>FlushPolicy</tt> we decided to mark only the largest DWPT as pending once the global active indexing memory exceeds the configured RAM buffer size, and to take the pending DWPT out of the active indexing memory as soon as it is marked.<br />
With this model we guarantee that there are always enough DWPTs available for indexing, while IO resources are utilized without blocking indexing throughput. This sounds very exciting in theory, but we had not yet seen how it performs in practice, for better or worse, so it was time to give it a shot in an experiment.</p>
<h3>Benchmarking Lucene Indexing</h3>
<p>Among Lucene devs we use a tool on Apache Extras called <a href="http://code.google.com/a/apache-extras.org/p/luceneutil/">LuceneUtil</a> to run all kinds of search performance benchmarks and ensure that our patches don&#8217;t have any negative impact on search performance. To benchmark the indexing process efficiently, I added some statistics, such as ingest rate (documents per second) and flush throughput, to show how the DWPT refactoring behaves.</p>
<p>We traditionally use an English <a href="http://en.wikipedia.org/">Wikipedia</a> XML export, which uncompresses to 21 GB of plain text.
I pointed LuceneUtil at a clean <a href="http://svn.apache.org/repos/asf/lucene/dev/trunk/">Lucene trunk</a> checkout as the baseline and used the <a href="http://svn.apache.org/repos/asf/lucene/dev/branches/realtime_search/">realtime branch</a> as its competitor. I kicked off a 10M-document indexing run on a 2x 6-core Xeon box with 24GB RAM and a <a href="http://www.hitachigst.com/internal-drives/enterprise/ultrastar/ultrastar-a7k2000">500GB Hitachi HDD</a>, which blasted those 10M documents into a Lucene index in 13 minutes and 40 seconds. Not bad!</p>
<figure id="attachment_3120" aria-describedby="caption-attachment-3120" style="width: 350px" class="wp-caption alignleft"><a href="http://blog.jteam.nl/wp-content/uploads/2011/04/Trunk_dps_01_small.png"><img class="size-full wp-image-3120" title="Fig. 3 Ingest rate on Lucene Trunk" src="http://blog.jteam.nl/wp-content/uploads/2011/04/Trunk_dps_01_small.png" alt="Fig. 3 Ingest rate on Lucene Trunk" width="350" height="263" /></a><figcaption id="caption-attachment-3120" class="wp-caption-text">Fig. 3 Ingest rate on Lucene Trunk</figcaption></figure>
<p>A snapshot of trunk&#8217;s ingest rate between 50 and 200 seconds after the start shows a nice peak performance of about 40k documents per second and confirms the theory that we make zero progress while flushing to disk. Flushing takes quite a bit of time and keeps the IO system busy.</p>
<p>Yet, I knew what to expect from trunk, so the exciting benchmark run was still left to do.
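</p>
<p>For perspective, a back-of-the-envelope calculation (my own arithmetic, not LuceneUtil output) puts trunk&#8217;s average for 10M documents in 13 minutes and 40 seconds at roughly 12.2k documents per second, far below the 40k peak; the gap is the price of the stop-the-world flushes:</p>

```java
// Average ingest rate implied by a wall-clock indexing time.
class IngestRate {
    static long docsPerSecond(long docs, long minutes, long seconds) {
        return docs / (minutes * 60 + seconds);
    }

    public static void main(String[] args) {
        // 10M documents in 13 min 40 s (820 s) on trunk: ~12195 docs/s average.
        System.out.println(docsPerSecond(10_000_000L, 13, 40));
    }
}
```

<p>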
As soon as I looked at the charts I knew that all the effort had paid off and that we are actually faster; the real question was how well we are doing in terms of resource utilization. The run returned in 6 minutes and 15 seconds. Wow, that is not even 50% of the time trunk needed!</p>
<figure id="attachment_3114" aria-describedby="caption-attachment-3114" style="width: 350px" class="wp-caption alignleft"><a href="http://blog.jteam.nl/wp-content/uploads/2011/04/DocumentsWriterPerThread_dps_01_small.png"><img class="size-full wp-image-3114" title="Fig. 4 Ingest rate with DocumentsWriterPerThread" src="http://blog.jteam.nl/wp-content/uploads/2011/04/DocumentsWriterPerThread_dps_01_small.png" alt="Fig. 4 Ingest rate with DocumentsWriterPerThread" width="350" height="263" /></a><figcaption id="caption-attachment-3114" class="wp-caption-text">Fig. 4 Ingest rate with DocumentsWriterPerThread</figcaption></figure>
<p>These are amazing results: we have threads constantly adding documents to the index while flushes are running in the background. The peaks still seem to be a little lower than on trunk, most likely because a thread is flushing in the background: we simply swap a DWPT out once it must flush and hijack its indexing thread to do the flush.</p>
<p>Looking at the flush throughput (Fig. 5) shows that flushing concurrently pays off very well in terms of IO utilization. DWPT is constantly using IO, with only small overlaps, which seems to indicate that the disk cannot keep up with the flushes.
This also explains the differences in documents per second in Figure 4: when flushes overlap, the hijacked indexing threads cannot make progress, so the ingest rate drops.</p>
<figure id="attachment_3115" aria-describedby="caption-attachment-3115" style="width: 350px" class="wp-caption alignleft"><a href="http://blog.jteam.nl/wp-content/uploads/2011/04/DocumentsWriterPerThread_flush_01_small.png"><img class="size-full wp-image-3115" title="Fig. 5 Flushing rate with DocumentsWriterPerThread" src="http://blog.jteam.nl/wp-content/uploads/2011/04/DocumentsWriterPerThread_flush_01_small.png" alt="Fig. 5 Flushing rate with DocumentsWriterPerThread" width="350" height="263" /></a><figcaption id="caption-attachment-3115" class="wp-caption-text">Fig. 5 Flushing rate with DocumentsWriterPerThread</figcaption></figure>
<p>The overall results for DWPT seem amazing, and given that we haven&#8217;t really started optimizing yet, they promise even further improvements along those lines. That said, I&#8217;m curious how this change can improve indexing speed on <a title="Hadoop" href="http://hadoop.apache.org">Hadoop</a> together with our new <a href="https://issues.apache.org/jira/browse/LUCENE-2373">AppendingCodec</a>, which allows writing to <a title="Hadoop Distributed File System" href="http://hadoop.apache.org/hdfs/">HDFS</a> directly. Concurrent flushing should provide good improvements here too.
Stay tuned!</p>
<p><em>Simon Willnauer, Trifork Blog, 1 April 2011</em></p>