Showing posts with label solr. Show all posts

Sunday, June 14, 2015

Solr soft commit Gotcha - OOM

Without frequent hard commits, a high indexing rate combined with Solr soft commits can lead to an out-of-memory error.


Our Solr collection stores browsing history, with a requirement that documents become searchable within 30 seconds at most. Because we have multiple writer processes, we rely on auto soft commits and auto hard commits to avoid a storm of explicit commits.

An OOM during an intense indexing session caught us by surprise. A quick heap dump inspection revealed a fat RAMDirectory in the heap. Why? Soft commits use Lucene NRT, which holds data in RAM; that memory is freed only once a hard commit arrives and persists the data to disk, ensuring durability.

The fault was in our auto hard commit policy, which was time-based only (maxTime=10min). If you index fast enough within those 10 minutes, you'll run out of memory.
We fixed that by adding a maxDocs=50,000 limit, where maxDocs is calculated by:
[size of doc] x [num of docs] <= [memory we want to spend per core]
500 bytes     x 50,000        <= 25 MB

We're currently running with 8 shard replicas, so the max memory usage for NRT would be 8 x 25 MB = 200 MB.
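
For reference, here is a minimal sketch of what such a commit policy might look like in solrconfig.xml. The maxDocs and hard commit maxTime values come from the numbers above; the 30-second soft commit interval is an assumption derived from our visibility requirement, and openSearcher=false is an assumption borrowed from the stock configuration, so adjust both to your own setup.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: persist to disk after 50,000 docs or 10 minutes,
       whichever comes first. maxTime is in milliseconds. -->
  <autoCommit>
    <maxDocs>50000</maxDocs>
    <maxTime>600000</maxTime>
    <!-- Assumption: do not open a new searcher on hard commit;
         visibility is handled by the soft commit below. -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: make new documents searchable.
       Assumption: 30 seconds, matching the visibility requirement. -->
  <autoSoftCommit>
    <maxTime>30000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With both limits in place, the hard commit fires on whichever threshold is reached first, so a burst of fast indexing can no longer pile up 10 minutes' worth of documents in RAM.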

So soft

Conclusion

When soft committing, make sure to limit heap usage by specifying both maxDocs and maxTime limits in your auto hard commit policy.
This is one of the factors that will affect your Solr memory usage.
Happy soft committing.


Thursday, February 19, 2015

Slow Solr Startup - Disabling the Suggester solved it

My Solr installation has 130M documents over 8 shards and takes 20 minutes of heavy CPU to start up.
The log showed big time gaps around the suggester being called:
[2/17/15 10:20:33:657 GMT] 000000ca SolrSuggester I org.apache.solr.spelling.suggest.SolrSuggester reload reload()
[2/17/15 10:20:33:657 GMT] 000000ca SolrSuggester I org.apache.solr.spelling.suggest.SolrSuggester build build() 
The suggester is configured by default in solrconfig.xml over non-existent fields. Despite that, and despite its request handler being marked for lazy startup, it still does a huge amount of work.

Disabling the suggest searchComponent and requestHandler solved the issue.
I assume that the suggester is building a huge FST in memory.
 
<!--
  Suggester disabled. XML comments cannot be nested, so the notes from the
  stock config are kept here instead of inline:
    lookupImpl:     org.apache.solr.spelling.suggest.fst
    dictionaryImpl: org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory

  <searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">FuzzyLookupFactory</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">cat</str>
      <str name="weightField">price</str>
      <str name="suggestAnalyzerFieldType">string</str>
    </lst>
  </searchComponent>

  <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="suggest">true</str>
      <str name="suggest.count">10</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>
-->
 
Later on I saw that someone went through the same thing before me.
I downloaded Solr 4.10 and saw that the suggester is disabled in the stock solrconfig.xml; it was fixed via SOLR-6679.