Thursday, March 31, 2016

Accessing Solr Cloud on AWS from SolrJ

We have a Solr Cloud installation on AWS EC2 instances, and we use the SolrJ client from our Java application. Until now, each of us ran a Solr Cloud installation on our local machine to test the code against. As the team started growing, we realized that we needed a way to access the AWS Solr Cloud from our local boxes, so that setting up Solr Cloud locally is not a blocker for feature development.

When I started looking around the web for a solution to this issue, I had to combine pieces from a few different documentation pages and this Stack Overflow page. I decided to write this blog post to help out others who run into the same issue. I hope it helps.

First, let's understand why the SolrJ client throws java.net.UnknownHostException.
When Solr Cloud starts, each Solr instance registers itself with the running ZooKeeper, and it does so with the private IP of the Solr machine. So when SolrJ connects to ZooKeeper, it gets back the private IP. SolrJ then tries to reach the Solr instance at that private address, which cannot be resolved from outside AWS, and the call fails with an unknown host exception.

To get around this, follow these steps:
Step 1: Start the Solr instance with an explicit host name (the -h option). Solr registers this host name in ZooKeeper instead of the private IP.
./bin/solr start -cloud -h hostname -p 8985 -z localhost:2181

Step 2: On your local machine, add an entry to the /etc/hosts file that maps this host name to the public IP of the Solr machine.
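
Before starting the application, it can help to confirm that the host name from Step 1 now resolves on your local machine. The following is only a minimal sketch using the standard java.net API, not part of SolrJ; the class name HostCheck and the default value "hostname" are placeholders of mine, so substitute the name you passed to -h.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;

// Hypothetical helper (not part of SolrJ): checks whether a host name resolves locally.
class HostCheck {

    // Returns the resolved IP address, or an empty Optional if the name cannot be resolved.
    static Optional<String> resolve(String hostName) {
        try {
            return Optional.of(InetAddress.getByName(hostName).getHostAddress());
        } catch (UnknownHostException e) {
            // The same exception type SolrJ surfaces when the registered name is unknown locally.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // "hostname" is a placeholder; pass the name used with -h in Step 1.
        String hostName = args.length > 0 ? args[0] : "hostname";
        Optional<String> address = resolve(hostName);
        if (address.isPresent()) {
            System.out.println(hostName + " resolves to " + address.get());
        } else {
            System.out.println(hostName + " does not resolve; check the /etc/hosts entry");
        }
    }
}
```

If the printed address is not the public IP of the Solr machine, the /etc/hosts entry is probably missing or mistyped.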

And you are done. You can now run the Java application that uses the SolrJ client on your local machine and access the Solr Cloud running on your AWS instances :)
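
If the application still cannot connect, one thing worth ruling out is whether the Solr port is reachable from your machine at all. Here is a rough sketch with plain java.net sockets, assuming the port 8985 from the start command above is open to your network; PortCheck and its timeout value are illustrative choices of mine, not anything SolrJ provides.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of SolrJ): tests whether a TCP port accepts connections.
class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers unresolved host names, refused connections, and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // "hostname" is a placeholder for the name used with -h; 8985 matches the start command.
        String host = args.length > 0 ? args[0] : "hostname";
        boolean reachable = canConnect(host, 8985, 3000);
        System.out.println(host + ":8985 reachable: " + reachable);
    }
}
```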
