
SOLR CONFIGURATION

Version 1.0

A GUIDE TO INSTALLING OPEN SOURCE SEARCH SOLUTION SOLR ON WINDOWS AND LINUX

Property of Skillnet Solutions Inc. For internal circulation only. All rights reserved.

CONTENTS

Introduction
Solr Setup
    Linux Notes
Clustering/Load Balancing Two Physical Machines
    Tomcat setup
    Apache setup
    Linux Notes
Clustering/Load Balancing One Physical Machine
    Tomcat setup
    Apache setup
    Linux Notes
Replication
Sample Production Architecture


Introduction
This document describes the process of setting up Solr on a Windows or Linux machine and integrating it with the popular web servers Apache and Tomcat. The following topics are discussed:

    - Solr Setup
    - Clustering/Load balancing on two physical machines
    - Clustering/Load balancing on one physical machine with two logical instances
    - Replication

Prerequisites: For the purposes of this document, we'll assume that Java (JDK 1.5.x and above), Apache HTTP Server 2.x, apache-solr-1.4.x and Tomcat 6.x have been downloaded and installed to the following directories:

Windows:
    Java              c:\java
    Apache WebServer  c:\software\Apache2.2
    Tomcat            c:\software\apache-tomcat-6.0.29\qs1
    Solr              c:\software\apache-solr-1.4.1

Linux:
    Java              /usr/java
    Apache WebServer  /etc/httpd
    Tomcat            /usr/apache-tomcat-6.0.29/qs1
    Solr              /usr/apache-solr-1.4.1

MUST BE INSTALLED AS ROOT

Note the special Linux notes for clustering in the document.


Solr Setup
1. Ensure that the Tomcat roles look like this in the file c:\software\apache-tomcat-6.0.29\qs1\conf\tomcat-users.xml:

    <tomcat-users>
      <role rolename="tomcat" />
      <role rolename="manager" />
      <user username="tomcat" password="tomcat" roles="tomcat,manager" />
      <user username="admin" password="admin" roles="tomcat,manager" />
    </tomcat-users>

   Note: Tomcat 7.0 split the manager role into multiple roles: manager-gui, manager-script, manager-jmx, manager-status.

2. Start the server and access http://localhost:8080/manager/html to make sure the manager application is accessible. The manager application comes in handy for deployment of web applications.

3. Copy c:\software\apache-solr-1.4.1\example\solr to c:\software\apache-tomcat-6.0.29\qs1.

4. Copy c:\software\apache-solr-1.4.1\dist\apache-solr-1.4.1.war to c:\software\apache-tomcat-6.0.29\qs1\solr.

5. The directory c:\software\apache-tomcat-6.0.29\qs1\solr should now look like the following (tree with important files shown):

    +---solr
    |   |   apache-solr-1.4.1.war
    |   +---conf
    |   |   |   schema.xml
    |   |   |   solrconfig.xml
    |   |   |
    |   |   \---xslt

6. Update the following element in c:\software\apache-tomcat-6.0.29\qs1\solr\conf\solrconfig.xml to point to the data directory:

    <dataDir>${solr.data.dir:c:/software/apache-tomcat-6.0.29/qs1/solr/data}</dataDir>

7. Create the following fragment in c:\software\apache-tomcat-6.0.29\qs1\conf\Catalina\localhost (as solr.xml, the file name used later in this document). This step deploys the application into Tomcat automatically when the server starts.

    <?xml version="1.0" encoding="utf-8" ?>
    <Context docBase="c:/software/apache-tomcat-6.0.29/qs1/solr/apache-solr-1.4.1.war" debug="0" crossContext="true">
      <Environment name="solr/home" type="java.lang.String" value="c:/software/apache-tomcat-6.0.29/qs1/solr" override="true" />
    </Context>

8. Restart the Tomcat server.

9. Access http://localhost:8080/solr/admin/ to ensure the Solr admin console is available.

At this point Solr is installed on Tomcat but it doesn't have any index files available.
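The context fragment above differs between instances only in the instance directory, so it can be generated rather than edited by hand. The following is a minimal sketch, not part of the original procedure; the function name solr_context_fragment and its parameters are hypothetical, and it assumes the Solr home always sits in a solr subdirectory of the Tomcat instance, as in this guide.

```python
from xml.sax.saxutils import quoteattr


def solr_context_fragment(instance_dir, war_name="apache-solr-1.4.1.war"):
    """Return the Tomcat context fragment text for one Tomcat instance.

    instance_dir is the Tomcat instance directory, for example
    c:\\software\\apache-tomcat-6.0.29\\qs1 (assumed layout: <instance>/solr).
    """
    # Use forward slashes, as the fragment in this guide does on Windows.
    solr_home = instance_dir.replace("\\", "/").rstrip("/") + "/solr"
    doc_base = solr_home + "/" + war_name
    # quoteattr() adds the surrounding quotes and escapes special characters.
    return (
        '<?xml version="1.0" encoding="utf-8" ?>\n'
        f'<Context docBase={quoteattr(doc_base)} debug="0" crossContext="true">\n'
        '  <Environment name="solr/home" type="java.lang.String"\n'
        f'               value={quoteattr(solr_home)} override="true" />\n'
        '</Context>\n'
    )
```

Calling solr_context_fragment(r"c:\software\apache-tomcat-6.0.29\qs1") returns text that could be saved under conf\Catalina\localhost for that instance.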
Some test data in XML form is available under the Solr installation: c:\software\apache-solr-1.4.1\example\exampledocs. This directory also comes with a file post.jar. The XML data can be posted to the Solr application and the index built as follows:

    cd c:\software\apache-solr-1.4.1\example\exampledocs\
    java -Durl=http://localhost:8080/solr/update -jar post.jar *.xml

Indexing the first time creates the index files under the data directory specified in solrconfig.xml (c:/software/apache-tomcat-6.0.29/qs1/solr/data). The data directory is automatically created. The updated solr directory should look similar to the following:

    +---solr
    |   |   apache-solr-1.4.1.war
    |   +---conf
    |   |   |   schema.xml
    |   |   |   solrconfig.xml
    |   |   |
    |   |   \---xslt
    |   |
    |   \---data
    |       +---index
    |       |       segments.gen
    |       |       segments_2
    |       \---spellchecker
    |               segments.gen
    |               segments_1

Now the Solr index is built with sample data. In the Solr admin console, search for ipod and the search results should be visible.
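The same check can be scripted instead of using the admin console. The sketch below is an illustration only: the helper names are hypothetical, and it assumes Solr's standard /select handler and its default XML response format, in which the hit count appears as the numFound attribute of the result element.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen


def select_url(base_url, query, rows=10):
    """Build a /select URL for a simple keyword query."""
    return base_url.rstrip("/") + "/select?" + urlencode({"q": query, "rows": rows})


def num_found(response_xml):
    """Extract the hit count from a Solr XML response (str or bytes)."""
    root = ET.fromstring(response_xml)
    result = root.find("result")
    if result is None:
        raise ValueError("no <result> element in response")
    return int(result.get("numFound"))


def search_count(base_url, query):
    """Query a running Solr instance and return the number of hits."""
    with urlopen(select_url(base_url, query), timeout=10) as resp:
        return num_found(resp.read())
```

With Tomcat running, search_count("http://localhost:8080/solr", "ipod") should return a value greater than zero once the sample documents are indexed.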

Linux Notes
The above discussion uses Windows file system paths as examples. There are no specific changes required for Linux except using the appropriate file system paths.


Clustering/Load Balancing Two Physical Machines

Tomcat setup

1. Set up a Tomcat + Solr instance on MACHINE1 as shown in the Solr Setup section of this document.

2. Copy the instance to MACHINE2 and rename it to qs2: c:\software\apache-tomcat-6.0.29\qs2. Because the directory name changes, replace qs1 with qs2 in the copied context fragment and solrconfig.xml, as described for the one-machine setup.

3. Add a jvmRoute attribute to the Engine element on MACHINE1 in c:\software\apache-tomcat-6.0.29\qs1\conf\server.xml:

    <Engine name="Catalina" defaultHost="localhost" jvmRoute="worker1">

4. Add a jvmRoute attribute to the Engine element on MACHINE2 in c:\software\apache-tomcat-6.0.29\qs2\conf\server.xml:

    <Engine name="Catalina" defaultHost="localhost" jvmRoute="worker2">

Apache setup
1. Install Apache on machine0.

2. Download the mod_jk module for Windows. It is a single file with the extension .so (shared object).

3. Rename the file to mod_jk.so and copy it to the c:\software\Apache2.2\modules directory (the location that the LoadModule path modules/mod_jk.so resolves to, relative to the Apache server root).

4. Create a workers.properties file in the c:\software\Apache2.2\conf directory with the following contents. Adjust the entries to match the current settings and whether Tomcat was set up on one physical machine with two logical instances or on two physical machines.

    workers.java_home=c:/java
    worker.list=balancer

    worker.worker1.port=8009
    worker.worker1.host=machine1
    worker.worker1.type=ajp13
    worker.worker1.lbfactor=1

    worker.worker2.port=8009
    worker.worker2.host=machine2
    worker.worker2.type=ajp13
    worker.worker2.lbfactor=1

    worker.balancer.type=lb
    worker.balancer.balance_workers=worker1,worker2
    worker.balancer.method=B
    # Specifies whether requests with SESSION ID's
    # should be routed back to the same
    # Tomcat worker.
    #worker.balancer.sticky_session=True

5. Add the following to the httpd.conf file after the last LoadModule statement, adjusting the entries to match the current settings.

    ###### TOMCAT INTEGRATION ###############
    # Load the jk module
    LoadModule jk_module modules/mod_jk.so
    # Path to workers.properties
    JkWorkersFile "c:/software/Apache2.2/conf/workers.properties"
    # Path to jk logs
    JkLogFile "c:/software/Apache2.2/logs/mod_jk.log"
    # Jk log level [debug/error/info]
    JkLogLevel info
    # Jk log format
    JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
    # JkOptions for forwarding
    JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
    # JkRequestLogFormat set the request format
    JkRequestLogFormat "%w %V %T"
    JkMount /solr balancer
    JkMount /solr/* balancer
    ###### END TOMCAT INTEGRATION ###############

6. Start Apache on machine0 and access it like so: http://machine0/solr

The above should automatically balance the requests between machine1 and machine2.
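Since the workers.properties file in this section follows a fixed pattern per worker, it can be rendered from a short list of workers. This is a sketch under stated assumptions rather than a required tool: the function name workers_properties and its parameters are hypothetical, and only the properties that appear in this guide are emitted.

```python
def workers_properties(workers, java_home="c:/java", sticky_session=None):
    """Render workers.properties text for a single mod_jk load balancer.

    workers is a list of (name, host, ajp_port) tuples.
    sticky_session=None leaves the property commented out, as in the
    two-machine example; True or False writes it explicitly.
    """
    lines = [f"workers.java_home={java_home}", "worker.list=balancer", ""]
    for name, host, port in workers:
        lines += [
            f"worker.{name}.port={port}",
            f"worker.{name}.host={host}",
            f"worker.{name}.type=ajp13",
            f"worker.{name}.lbfactor=1",
            "",
        ]
    names = ",".join(name for name, _, _ in workers)
    lines += [
        "worker.balancer.type=lb",
        f"worker.balancer.balance_workers={names}",
        "worker.balancer.method=B",
    ]
    if sticky_session is None:
        lines.append("#worker.balancer.sticky_session=True")
    else:
        lines.append(f"worker.balancer.sticky_session={sticky_session}")
    return "\n".join(lines) + "\n"
```

For the two-machine layout, workers_properties([("worker1", "machine1", 8009), ("worker2", "machine2", 8009)]) produces the same entries as the file shown in this section.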

Linux Notes
The following changes are required in the Apache configuration for Linux:

Download the mod_jk module for Linux and copy it to the /etc/httpd/modules directory.

The httpd.conf file gets an additional entry on Linux in the Tomcat integration section: the JkShmFile entry, which sets the shared memory location.

    ###### TOMCAT INTEGRATION ###############
    # Load the jk module
    LoadModule jk_module modules/mod_jk.so
    # Path to workers.properties
    JkWorkersFile "/etc/httpd/conf/workers.properties"
    # Path to jk logs
    JkLogFile "/etc/httpd/logs/mod_jk.log"
    # Jk log level [debug/error/info]
    JkLogLevel info
    # Jk log format
    JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
    # JkOptions for forwarding
    JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
    # JkRequestLogFormat set the request format
    JkRequestLogFormat "%w %V %T"
    JkShmFile /var/cache/httpd/mod_jk.shm
    JkMount /solr balancer
    JkMount /solr/* balancer
    ###### END TOMCAT INTEGRATION ###############

For SELinux (Security-Enhanced Linux), the next step is needed to ensure that Apache has access to the shared memory file specified above in httpd.conf.

Update access to the shared memory location specified in httpd.conf (JkShmFile /var/cache/httpd/mod_jk.shm). This needs to be done as ROOT:

    [root@CASLRWEBDEV1 files]# mkdir /var/cache/httpd
    [root@CASLRWEBDEV1 files]# setfiles -v -l -d /etc/selinux/targeted/contexts/files/file_contexts /var/cache/httpd
    setfiles: labeling files under /var/cache/httpd
    setfiles: /var/cache/httpd matched by system_u:object_r:httpd_cache_t:s0
    setfiles: relabeling /var/cache/httpd from user_u:object_r:var_t:s0 to system_u:object_r:httpd_cache_t:s0
    matchpathcon_filespec_eval: hash table stats: 1 elements, 1/65536 buckets used, longest chain length 1
    setfiles: Done.


Clustering/Load Balancing One Physical Machine

Tomcat setup

1. Set up one Tomcat + Solr instance as shown in the Solr Setup section of this document.

2. Add a jvmRoute attribute to the Engine element in c:\software\apache-tomcat-6.0.29\qs1\conf\server.xml:

    <Engine name="Catalina" defaultHost="localhost" jvmRoute="worker1">

3. Copy c:\software\apache-tomcat-6.0.29\qs1 to c:\software\apache-tomcat-6.0.29\qs2.

4. In c:\software\apache-tomcat-6.0.29\qs2\conf\Catalina\localhost\solr.xml, replace qs1 with qs2.

5. In c:\software\apache-tomcat-6.0.29\qs2\solr\conf\solrconfig.xml, replace qs1 with qs2.

6. In c:\software\apache-tomcat-6.0.29\qs2\conf\server.xml, locate the following elements and replace the values according to the matrix shown below.

    Element Type             Find Value    Replace Value
    Engine jvmRoute          worker1       worker2
    SHUTDOWN port            8005          8025
    Connector port (HTTP)    8080          8090
    Connector port (AJP)     8009          8019
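The find/replace matrix above can also be applied programmatically. The following sketch uses Python's standard xml.etree.ElementTree; the function name retarget_server_xml is hypothetical, and it assumes the stock Tomcat 6 layout in which the SHUTDOWN port is the port attribute of the root Server element. Note that ElementTree discards XML comments, so the rewritten file loses the explanatory comments of the stock server.xml.

```python
import xml.etree.ElementTree as ET

# Find/replace values taken from the matrix in this section.
PORT_MAP = {"8005": "8025", "8080": "8090", "8009": "8019"}


def retarget_server_xml(xml_text, jvm_route="worker2", port_map=PORT_MAP):
    """Return server.xml text rewritten for the second logical instance."""
    root = ET.fromstring(xml_text)            # root is the <Server> element
    if root.get("port") in port_map:          # SHUTDOWN port
        root.set("port", port_map[root.get("port")])
    for connector in root.iter("Connector"):  # HTTP and AJP connectors
        port = connector.get("port")
        if port in port_map:
            connector.set("port", port_map[port])
    for engine in root.iter("Engine"):        # load balancer route name
        engine.set("jvmRoute", jvm_route)
    return ET.tostring(root, encoding="unicode")
```

Reading qs2\conf\server.xml, passing its text through retarget_server_xml, and writing the result back would apply all four replacements in one pass; keeping a backup of the original file first is advisable.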

Apache setup
1. Install Apache on machine0.

2. Download the mod_jk module for Windows. It is a single file with the extension .so (shared object).

3. Rename the file to mod_jk.so and copy it to the c:\software\Apache2.2\modules directory (the location that the LoadModule path modules/mod_jk.so resolves to, relative to the Apache server root).

4. Create a workers.properties file in the c:\software\Apache2.2\conf directory with the following contents. Adjust the entries to match the current settings and whether Tomcat was set up on one physical machine with two logical instances or on two physical machines. Note that the sticky_session property must be set to false on one physical machine with two logical instances; otherwise load balancing doesn't work properly.

    workers.java_home=c:/java
    worker.list=balancer

    worker.worker1.port=8009
    worker.worker1.host=machine1
    worker.worker1.type=ajp13
    worker.worker1.lbfactor=1

    worker.worker2.port=8019
    worker.worker2.host=machine1
    worker.worker2.type=ajp13
    worker.worker2.lbfactor=1

    worker.balancer.type=lb
    worker.balancer.balance_workers=worker1,worker2
    worker.balancer.method=B
    # Specifies whether requests with SESSION ID's
    # should be routed back to the same
    # Tomcat worker.
    worker.balancer.sticky_session=False

5. Add the following to the httpd.conf file after the last LoadModule statement, adjusting the entries to match the current settings.

    ###### TOMCAT INTEGRATION ###############
    # Load the jk module
    LoadModule jk_module modules/mod_jk.so
    # Path to workers.properties
    JkWorkersFile "c:/software/Apache2.2/conf/workers.properties"
    # Path to jk logs
    JkLogFile "c:/software/Apache2.2/logs/mod_jk.log"
    # Jk log level [debug/error/info]
    JkLogLevel info
    # Jk log format
    JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
    # JkOptions for forwarding
    JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
    # JkRequestLogFormat set the request format
    JkRequestLogFormat "%w %V %T"
    JkMount /solr balancer
    JkMount /solr/* balancer
    ###### END TOMCAT INTEGRATION ###############

6. Start Apache on machine0 and access it like so: http://machine0/solr

The above should automatically balance the requests between the two instances.
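With two logical instances on one host, the two workers must use different AJP ports (8009 and 8019 in this guide). A small check of workers.properties can catch a copy-and-paste mistake before Apache is started. The sketch below is illustrative only; both helper names are hypothetical, and it understands just the worker.<name>.<key>=<value> lines used in this document.

```python
def parse_workers(properties_text):
    """Parse worker.<name>.<key>=<value> lines into {name: {key: value}}."""
    workers = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if len(parts) == 3 and parts[0] == "worker":
            workers.setdefault(parts[1], {})[parts[2]] = value
    return workers


def endpoint_conflicts(properties_text):
    """Return pairs of ajp13 workers that share the same host and port."""
    seen, conflicts = {}, []
    for name, props in parse_workers(properties_text).items():
        if props.get("type") != "ajp13":
            continue  # the lb worker has no host/port of its own
        endpoint = (props.get("host"), props.get("port"))
        if endpoint in seen:
            conflicts.append((seen[endpoint], name))
        else:
            seen[endpoint] = name
    return conflicts
```

An empty list from endpoint_conflicts means every Tomcat worker has its own host and port combination.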

Linux Notes
The following changes are required in the Apache configuration for Linux:

Download the mod_jk module for Linux and copy it to the /etc/httpd/modules directory.

The httpd.conf file gets an additional entry on Linux in the Tomcat integration section: the JkShmFile entry, which sets the shared memory location.

    ###### TOMCAT INTEGRATION ###############
    # Load the jk module
    LoadModule jk_module modules/mod_jk.so
    # Path to workers.properties
    JkWorkersFile "/etc/httpd/conf/workers.properties"
    # Path to jk logs
    JkLogFile "/etc/httpd/logs/mod_jk.log"
    # Jk log level [debug/error/info]
    JkLogLevel info
    # Jk log format
    JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
    # JkOptions for forwarding
    JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
    # JkRequestLogFormat set the request format
    JkRequestLogFormat "%w %V %T"
    JkShmFile /var/cache/httpd/mod_jk.shm
    JkMount /solr balancer
    JkMount /solr/* balancer
    ###### END TOMCAT INTEGRATION ###############

For SELinux (Security-Enhanced Linux), the next two steps are needed to ensure that Apache has access to the shared memory file specified above in httpd.conf and to allow it access to the 8019 AJP port.

1. Update access to the shared memory location specified in httpd.conf (JkShmFile /var/cache/httpd/mod_jk.shm). This needs to be done as ROOT:

    [root@CASLRWEBDEV1 files]# mkdir /var/cache/httpd
    [root@CASLRWEBDEV1 files]# setfiles -v -l -d /etc/selinux/targeted/contexts/files/file_contexts /var/cache/httpd
    setfiles: labeling files under /var/cache/httpd
    setfiles: /var/cache/httpd matched by system_u:object_r:httpd_cache_t:s0
    setfiles: relabeling /var/cache/httpd from user_u:object_r:var_t:s0 to system_u:object_r:httpd_cache_t:s0
    matchpathcon_filespec_eval: hash table stats: 1 elements, 1/65536 buckets used, longest chain length 1
    setfiles: Done.

2. Allow access to the AJP 8019 port (by default only 8009 has access). This needs to be done as ROOT:

    semanage port -l | grep -w http_port_t         (displays allowed ports)
    semanage port -a -t http_port_t -p tcp 8019    (adds access to 8019 port)
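To confirm that the port was added, the output of the first semanage command can be checked by a script. The sketch below is an assumption-laden illustration: the function name ports_for_type is hypothetical, and it assumes each output line has the form "type protocol port-list", with ports separated by commas and ranges written as low-high. The exact column layout may differ between distributions.

```python
def ports_for_type(semanage_output, selinux_type="http_port_t", proto="tcp"):
    """Collect the port numbers listed for one SELinux type and protocol."""
    ports = set()
    for line in semanage_output.splitlines():
        fields = line.split(None, 2)  # type, protocol, remainder of the line
        if len(fields) < 3 or fields[0] != selinux_type or fields[1] != proto:
            continue
        for item in fields[2].split(","):
            item = item.strip()
            if "-" in item:           # a range such as 5900-5983
                low, high = item.split("-", 1)
                ports.update(range(int(low), int(high) + 1))
            elif item:
                ports.add(int(item))
    return ports
```

After the semanage port -a command has run, 8019 should be a member of the set returned for the captured output of semanage port -l.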


Replication
Replication involves automatically copying indexes from one Solr machine to another when indexing is run. In the figure below, indexing is run on the replication master, and query servers 1 and 2 poll the master every 10 seconds (configurable) and pull the latest index.

Set up a replication master server as shown in the section Solr Setup in the following directory: c:\software\apache-tomcat-6.0.29\repmstr. If the setup is on a single physical machine, follow the instructions in the section Clustering/Load Balancing One Physical Machine to avoid port and data directory conflicts.

In the file c:\software\apache-tomcat-6.0.29\repmstr\solr\conf\solrconfig.xml, uncomment the replication section and remove the slave element from it:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
        <str name="replicateAfter">startup</str>
        <str name="confFiles">schema.xml,stopwords.txt</str>
      </lst>
    </requestHandler>

Set up query servers 1 and 2 (qs1 and qs2) as shown in the section Clustering/Load Balancing Tomcat setup.

In c:\software\apache-tomcat-6.0.29\qs1\solr\conf\solrconfig.xml and c:\software\apache-tomcat-6.0.29\qs2\solr\conf\solrconfig.xml, uncomment the replication section and remove the master element from it. Further configure the masterUrl to point to the replication master instance. The pollInterval value uses the HH:mm:ss format; the sample below polls every 60 seconds, so use 00:00:10 for the 10-second polling described above.

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://master-server/solr/replication</str>
        <str name="pollInterval">00:00:60</str>
      </lst>
    </requestHandler>

Run indexing on the master server. The indexes will be automatically refreshed on query servers 1 and 2:

    java -Durl=http://master-server/solr/update -jar post.jar *.xml
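Two small helpers can make the replication settings easier to reason about. This is a sketch, not part of the original setup: the function names are hypothetical. It assumes the HH:mm:ss pollInterval format and the ReplicationHandler's HTTP command parameter (for example indexversion, details, or fetchindex), which can be used to inspect or trigger replication from a browser or script.

```python
from urllib.parse import urlencode


def poll_interval_seconds(value):
    """Convert a pollInterval string in HH:mm:ss form to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def replication_url(solr_base_url, command):
    """Build a ReplicationHandler URL for the given command."""
    return solr_base_url.rstrip("/") + "/replication?" + urlencode({"command": command})
```

For example, replication_url("http://master-server/solr", "indexversion") gives a URL that reports the master's current index version, and the same call with a query server's base URL and the details command shows that server's replication status.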

Linux Notes: The above discussion uses Windows file system paths as examples. There are no specific changes required for Linux except using the appropriate file system paths.


Sample Production Architecture


The following is a representative production architecture. The sizing recommendations are for Linux.
