<h1>Serving Authenticated Static Content was pretty expensive before today</h1>
<p>Posted 2016-12-19 at http://tarunjangra.com/2016/12/19/serving-authenticated-static-content</p>
<p>It has always been painful to serve authenticated static content, because we are bound to a programming framework
to handle the authentication job. And once authenticated, </p>
<!--more-->
<p>you have to read the file from disk in your application code and
stream it to the end user with the correct MIME type. This was the only solution I had before today.</p>
<p><img src="http://i.imgur.com/ABtCr5z.jpg" alt="Nginx x-accel module"></p>
<p>While working on a project, I found XSendfile and X-Accel. X-Accel allows an internal redirect to a location
determined by a header returned from a backend.</p>
<p>This allows you to handle authentication, logging, or whatever else you please in your backend, and then have NGINX
serve the contents from the redirected location to the end user, thus freeing up the backend to handle other requests. This
feature is commonly known as X-Sendfile.</p>
<p>NGINX also has this feature, but implemented a little bit differently. In NGINX this feature is called X-Accel-Redirect.</p>
<p>There are two main differences:</p>
<ol>
<li>The header must contain a URI.</li>
<li>The location should be defined as internal; to prevent the client from going directly to the URI.</li>
</ol>
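<p>As a sketch of how the pieces fit together (the location and file paths here are made up for illustration): the backend authenticates the request and answers with only a header, and NGINX serves the file from an internal location.</p>
<figure class="highlight"><pre><code class="language-nginx" data-lang="nginx"># NGINX side: an internal location that clients cannot request directly.
location /protected/ {
    internal;
    alias /var/www/protected-files/;
}</code></pre></figure>
<p>After authenticating, the backend returns an empty body with a header such as "X-Accel-Redirect: /protected/report.pdf", and NGINX streams /var/www/protected-files/report.pdf to the client with the correct MIME type.</p>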
<p>We had been missing this feature in our Elgg development. We will definitely use this module to get better
performance when serving static content to the end user: no more reading the stream from disk in the application and serving it on.</p>
<h1>My First post with Jekyll</h1>
<p>Posted 2016-07-09 at http://tarunjangra.com/2016/07/09/hello-world</p>
<p>In this blog post I do not have anything particular to talk about, so it is just an introduction
to my new blog built on Jekyll. Since it is Jekyll based, I've used Travis-CI for building
and GitHub for hosting this blog.</p>
<!--more-->
<p>As described in <a href="/about-me.html">About Me</a>, I'm passionate about programming, cloud computing, and
entrepreneurship. So that's what I'll be writing about.</p>
<p>So I just want to say "Hello", and thank you for taking the time to read this blog.</p>
<h1>Amazon Elastic Cloud Computing (EC2)</h1>
<p>Posted 2016-03-02 at http://tarunjangra.com/2016/03/02/amazon-ec2</p>
<p>Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides re-sizable compute capacity in the cloud.
Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale
capacity, both up and down, as your computing requirements change.</p>
<!--more-->
<figure><img src='http://imgur.com/CBUgmAj.png' style='width:600px;max-width:100%;' alt='Amazon EC2'/><figcaption>Amazon EC2</figcaption></figure>
<p>We have been working in IT for the last 10 years. I remember a time when, if we needed a new Active Directory server or a new SQL Server,
we had to go to HP or Dell and order new servers. They then had to be delivered to our data centers, racked,
networked, made internet accessible, and so on, and the provisioning time could be anywhere
from 5 to 10 business days.</p>
<p>Then I started using the public cloud, and it was really exciting to see its capabilities: instead of a 5-to-10-day
lead time, you can literally have a server up and running in a couple of minutes. That is
how cloud computing has changed the IT industry in the last 5 to 10 years. Amazon EC2 changes the economics of
computing by allowing you to pay only for the capacity you actually use, and it provides developers the tools
to build failure-resilient applications and isolate themselves from common failure scenarios. Look at the first
advantage of cloud computing, the utility-based pricing model: you pay only by the hour. If you want to spin up
a development environment, test on it, and then terminate it, you pay only for the 1 or 2 hours the environment is live;
in the old model you would buy the server hardware and be stuck with it.</p>
<h2 id="elastic-compute-cloud-pricing-options">Elastic Compute Cloud Pricing Options</h2>
<ol>
<li><strong>Free Tier</strong>: you get 750 hours free per month on certain micro instances.</li>
<li><strong>On Demand</strong>: allows you to pay a fixed rate by the hour with no commitment.</li>
<li><strong>Reserved</strong>: provides you with a capacity reservation and offers a significant discount on the hourly charge for
an instance, on 1-year or 3-year terms. Reserved is just saying "I need 10 servers of this size, and I am willing
to pay up front and commit for 1 or 3 years"; if you do use reserved instances, you get
massive discounts compared with on demand.</li>
<li><strong>Spot</strong>: enables you to bid whatever price you want to pay for instance capacity, providing even greater savings
if your applications have flexible start and end times.</li>
</ol>
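<p>As a rough sketch of how two of these options look in practice with the (modern) AWS CLI; the AMI ID, instance type, and bid price below are placeholders:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"># On Demand: fixed hourly rate, no commitment.
aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type t2.micro --count 1

# Spot: bid a maximum hourly price; you keep the instance while your bid holds.
aws ec2 request-spot-instances --spot-price "0.05" \
  --launch-specification '{"ImageId": "ami-xxxxxxxx", "InstanceType": "t2.micro"}'</code></pre></figure>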
<h2 id="elastic-compute-cloud-on-demand-vs-reserved-vs-spot">Elastic Compute Cloud On Demand vs Reserved vs Spot</h2>
<ol>
<li><strong>On Demand Instances</strong>
<ul>
<li>Users that want the low cost and flexibility of Amazon EC2 without any up-front payment or long-term commitment.</li>
<li>Applications with short term, spike, or unpredictable workloads that cannot be interrupted.</li>
<li>Applications being developed or tested on Amazon EC2 for the first time.</li>
</ul></li>
<li><strong>Reserved Instances</strong>
<ul>
<li>Applications with steady-state or predictable usage. Reserved might be your 3 or 4 web servers that you always want
turned on, while your on-demand instances might be the ones launched as part of an auto-scaling event.</li>
<li>Applications that require reserved capacity.</li>
<li>Users able to make upfront payment to reduce their total computing costs even further.</li>
</ul></li>
<li><strong>Spot Instances</strong>
<ul>
<li>Applications that have flexible start and end times.</li>
<li>Applications that are only feasible at very low compute prices.</li>
<li>Users with urgent computing needs for large amounts of additional capacity.</li>
</ul></li>
</ol>
<h2 id="elastic-compute-cloud-nbsp-on-demand-instances">Elastic Compute Cloud On Demand Instances</h2>
<ol>
<li>General Purpose Instances</li>
<li>Compute Optimized Instances
<ul>
<li>Compute Intensive Applications</li>
</ul></li>
<li>Memory Optimized Instances
<ul>
<li>Database & Memory Caching Applications</li>
</ul></li>
<li>GPU Instances
<ul>
<li>High Performance Parallel Computing (e.g. Hadoop)</li>
</ul></li>
<li>Storage Optimized Instances
<ul>
<li>Data warehousing and Parallel Computing</li>
</ul></li>
</ol>
<h2 id="local-instance-storage-vs-elastic-block-storage">Local Instance Storage vs Elastic Block Storage</h2>
<ol>
<li><strong>Local Instance Storage</strong>: Data stored on a local instance store persists only as long as the instance is alive. If you terminate the
instance, you lose all the data on that virtual hardware.</li>
<li><strong>Elastic Block Storage Backed Storage</strong>: Data that is stored on an Amazon Elastic Block Storage volume will persist independently of the life of the instance.</li>
</ol>
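<p>To illustrate the difference, here is a hypothetical AWS CLI sketch: an EBS volume is created as its own resource and merely attached to an instance, which is why its data outlives the instance (the volume and instance IDs below are placeholders):</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"># Create a 100 GB General Purpose SSD volume.
aws ec2 create-volume --size 100 --volume-type gp2 --availability-zone us-east-1a

# Attach it to an instance; terminating the instance later leaves the volume (and data) intact.
aws ec2 attach-volume --volume-id vol-xxxxxxxx --instance-id i-xxxxxxxx --device /dev/sdf</code></pre></figure>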
<h2 id="storage-backed-by-elastic-block-storage">Storage backed by Elastic Block Storage</h2>
<ol>
<li><strong>Provisioned IOPS Solid State Drive</strong>
<ul>
<li>Designed for I/O intensive applications such as large relational or No-SQL databases.</li>
</ul></li>
<li><strong>General purpose Solid State Drive</strong>
<ul>
<li>Designed for 99.999% availability.</li>
<li>Ratio of 3 IOPS per GB, offer single digit millisecond latency, and also have the ability to burst up to 3000 IOPS
for short periods.</li>
</ul></li>
<li><strong>Magnetic</strong>
<ul>
<li>Lowest cost per gigabyte of all Elastic Block Storage volume types. Magnetic volumes are ideal for workloads where data
is accessed infrequently, and applications where the lowest storage cost is important.</li>
</ul></li>
</ol>
<h1>ElasticSearch restore failed when s3-gateway is activated</h1>
<p>Posted 2014-07-11 at http://tarunjangra.com/2014/07/11/elasticSearch-restore-failed-when-s3-gateway-activated</p>
<p>Hufffff, unfortunately I hit this edge case. I have recovered from the situation. Here's my scenario.</p>
<ul>
<li>I am on ElasticSearch Version 1.1.0</li>
<li>I have two data nodes: one holds the primary shards and the other the replicas.</li>
<li>I am taking regular snapshots of my indexes.</li>
<li>I am no longer taking snapshots; instead, I have installed the s3-gateway plugin to keep an S3 bucket updated as persistent index storage.</li>
</ul>
<!--more-->
<p>Because of a bulk import, I had stopped my replica to make the import a little faster. Once the import completed, I saw high CPU and memory usage. Since I believed my indexes were safe because I was using the s3-gateway, I decided to restart the remaining data node. That was a big mistake. When I restarted it, it did not recover all the indexes. We were about to launch our site in the next two hours, and I was left with no index.</p>
<p>Struggling here and there, I came to learn that I was hit by a bug in ElasticSearch. I tried to follow the instructions at the end of the bug thread, where I was supposed to update/edit the metadata file in the S3 bucket. I did that, but no luck.</p>
<p>The problem I found: all indexes and shards are supposed to have _source folders, and I had many indexes and shards where the _source folder was missing. Those indexes were unrecoverable. I had no solution at that point and was literally sweating in an air-conditioned room.</p>
<p>Then one of my colleagues, Narinder Kaur, joined me. She gave me the necessary support while we tried a few more experiments to fix it. Since I had already made one mistake, I took a backup of the existing elasticsearch directory so that I could get back to the same place in case of any further mess, because the solutions we were planning to try were totally unproven.</p>
<p>So, here is the solution we tried, and which actually worked. Wow!</p>
<ol>
<li>I updated my elasticsearch.yml and removed the s3-gateway settings related to my S3 bucket.</li>
<li>I stopped elasticsearch.</li>
<li>I renamed my old cluster directory (elasticsearch) to elasticsearch.original.</li>
<li>I restarted Elasticsearch, and it created a new blank cluster with no indexes.</li>
<li>I created all the required indexes with the same number of shards and replicas I previously had. In my case I had 5 indexes and 5 shards per index.</li>
<li>I stopped elasticsearch again.</li>
<li>I deleted (elasticsearch/nodes/0/indices/{index_name}/{0,1,2,3,4}/{index,translog}) and moved (elasticsearch.original/nodes/0/indices/{index_name}/{0,1,2,3,4}/{index,translog}) to (elasticsearch/nodes/0/indices/{index_name}/{0,1,2,3,4}/{index,translog}).
<strong>Note</strong>: Here, I did not touch the _state folder of the blank indexes. So all my indexes now had a _state folder in each index and each shard.</li>
<li>I copied all indexes this way, as created in step 5.</li>
<li>I restarted ElasticSearch, and found all indexes were recovered.</li>
</ol>
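<p>For step 5, the blank indexes can be recreated over the REST API. A sketch with curl (the index name is a placeholder; syntax as in the ElasticSearch 1.x create-index API):</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"># Create an index with 5 shards and 1 replica, matching the old layout.
curl -XPUT 'http://localhost:9200/my_index' -d '{
  "settings": { "number_of_shards": 5, "number_of_replicas": 1 }
}'</code></pre></figure>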
<p>Observation: you should run all your custom mappings on the blank indexes. I hit some errors because I had not executed my mappings.</p>
<p>Thank god, all the indexes were recovered. And thanks to Narinder Kaur, who got me the support I needed at that time.</p>
<h1>How to install go-daddy ssl certificate on amazon load balancer</h1>
<p>Posted 2012-12-29 at http://tarunjangra.com/2012/12/29/how-to-install-godaddy-ssl-on-ELB</p>
<p>I was struggling to install an SSL certificate on an ELB, and I finally made it. Following are the steps you need to follow.</p>
<h3 id="requirements-amp-prerequisites">Requirements & Prerequisites:</h3>
<ol>
<li>A Linux box with openssl and apache installed.</li>
<li>An open shell terminal on your Linux box.</li>
</ol>
<!--more-->
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">openssl genrsa -des3 -out private.key 2048
openssl req -new -key private.key -out www.your-web-site.com.csr</code></pre></figure>
<p>You will be prompted for some basic information. Make sure the "Common Name" is a fully qualified domain name, like "www.xyz.com".</p>
<ol>
<li>Open <a href="http://www.godaddy.com">GoDaddy</a> and go to the SSL management control panel.</li>
<li>Select your certificate and click the Re-Key button.</li>
<li>Copy the content of "www.your-web-site.com.csr", paste it into the "CSR" field, and press Re-Key.</li>
<li>It will prompt you to download the keys. The available download options are Apache, Nginx and Other; I used "Other" to download the keys to be used on the ELB.</li>
<li>Now unzip the downloaded file. It should contain two *.crt files.</li>
</ol>
<h3 id="now-back-to-your-terminal">Now back to your terminal.</h3>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">openssl rsa -in private.key -out private.pem</code></pre></figure>
<p>Now you will have the following files in your current location.</p>
<ol>
<li>private.key</li>
<li>private.pem</li>
<li>"www.your-web-site.com.csr"</li>
<li>sf_bundle.crt</li>
<li>your-domain.com.crt</li>
</ol>
<p>Now open your load balancer console and add HTTPS support. It will prompt you to add the following values.</p>
<ol>
<li>Certificate Name:* -> Put any friendly name</li>
<li>Private Key:* -> Paste content of private.pem</li>
<li>Public Key Certificate:* -> Paste content of your-domain.com.crt.</li>
<li>Certificate Chain: -> Paste content of sf_bundle.crt</li>
</ol>
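<p>For reference, the same certificate can nowadays be uploaded from the command line instead of the console. A sketch with the modern AWS CLI (the certificate name is arbitrary; the file names match the list above):</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">aws iam upload-server-certificate \
  --server-certificate-name my-godaddy-cert \
  --certificate-body file://your-domain.com.crt \
  --private-key file://private.pem \
  --certificate-chain file://sf_bundle.crt</code></pre></figure>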
<p>Once done, save all these values and you are good to go.</p>
<h1>Logical Volume Manager (LVM) can help if you are out of space</h1>
<p>Posted 2012-11-12 at http://tarunjangra.com/2012/11/12/Logical-Volume-Manager-LVM-can-help-if-you-are-out-of-space</p>
<p>Today I found that my Ubuntu server's home partition was about to fill up. It holds lots of projects we are
working on. Replacing the old hard disk with a new one of bigger size is one solution, but it is very time consuming. It is scary:
copy everything from the old to the new hard drive, install every single application and library my scripts need.</p>
<!--more-->
<p><img src="http://tarunjangra.com/images/assets/LVM_original_description.png" alt="Logical Volume Manager"></p>
<p>Obviously that is a time-consuming process. But thanks to Logical Volume Manager (LVM), and because fortunately I had used LVM when configuring my old hard drive,
I was able to extend my "home" volume in minutes, without the copying and all the boring stuff explained above. My old
hard disk scheme was:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">100MB /boot
73GB PV <span class="o">(</span>Physical Volume <span class="o">)</span>
<span class="m">3</span> GB /myDB <span class="o">(</span> my database directory<span class="o">)</span>
45GB /home <span class="o">(</span>All my projects are located in home<span class="o">)</span></code></pre></figure>
<p>So I was running out of space. What I did: I purchased a new 1TB WD SATA hard drive and connected it to the secondary SATA port.
My Ubuntu box detected the new hard drive. I made sure of it with the following commands.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">fdisk -l
<span class="c"># i got both disks listed, /dev/sda and /dev/sdb (the new one).</span>
pvcreate /dev/sdb
<span class="c"># initialize the new disk as an LVM physical volume.</span>
vgdisplay
<span class="c"># I got the name of the volume group to be used.</span>
vgextend volume-group-name /dev/sdb
<span class="c"># this command puts the new hard drive in the existing volume group.</span>
vgdisplay
<span class="c"># to make sure the new hard drive was actually added to the group.</span>
lvextend -L +500G /dev/volume-group-name/drive-name
<span class="c"># "drive-name" is the logical volume mounted at my "home" dir.</span>
resize2fs /dev/volume-group-name/drive-name
<span class="c"># This took about 10 mins to extend my home by another 500G.</span></code></pre></figure>
<p>So this is how I extended the space. I noticed that while extending, I could still access all the projects on the volume being extended.
There was no crash and no restart (usually forced by Windows for such tasks :) ). The process is efficient enough that you can keep using
your disk even while making this arrangement for more space. Anyway, that is how I got everything working within 10 minutes.
It was a really amazing experience.</p>
<h1>Our development workflow with gitflow</h1>
<p>Posted 2012-01-19 at http://tarunjangra.com/2012/01/19/our-development-workflow-with-git</p>
<p>We have been using git since 2009. Recently we were pushed by a platform requirement to implement a better development workflow, with
better branching, code releases, etc. And we found gitflow, a collection of git extensions providing high-level git
operations.</p>
<!--more-->
<p>I found our experience well worth sharing. Before gitflow, we were using git with a Master
branch only, where all developers would push; the code would move to the development server and, after testing,
be deployed to the production server. That is a bit of a cumbersome process, and as our requirements grew toward better-tracked
development with less effort, we started feeling the need for a serious process. We have followed Vincent Driessen's
branching model.</p>
<p><img src="http://tarunjangra.com/images/assets/git-workflow-gitflow.png" alt="Gitflow"></p>
<p>The master branch will now be our production-ready branch, and the develop branch will be our dev server branch. These two branches
are supposed to live in the system indefinitely. We have learnt to keep some temporary branches, like "feature branches" and
"release branches", which play a great role in the architecture we are working in. We use "Pivotal Tracker" for
our Agile methodology, so when we have a new milestone with multiple stories for a particular feature, the developer
needs to create a new branch named "feature/<feature-name>". This branch is cloned from the develop branch
and stays in the system until the feature is complete, and is then merged back into develop. So over the whole
release we are supposed to complete all Pivotal stories by story ids.</p>
<p>I am looking for some automation where all stories get started when the developer creates the feature branch, and when
he delivers the whole feature and merges the branch back into develop, the status of the
story automatically changes to "Delivered". The QA team will then test and either accept or reject the corresponding story. I know the webhooks provided
by github.com can be used to achieve this with Pivotal Tracker.</p>
<p>Overall, the gitflow methodology makes the development flow quite a bit better than what we were doing earlier.</p>
<h1>Round-robin at application level to Balance MySQL Database Load</h1>
<p>Posted 2011-06-10 at http://tarunjangra.com/2011/06/10/Round-robin-at-application-level-to-Balance-MySQL-Database-Load</p>
<p>The round-robin technique lets you distribute your read queries across any number of available resources, even if the servers are located
in different locations. Huge-traffic sites like Facebook have to have such techniques working in the background to serve as fast as
possible.</p>
<!--more-->
<p>I would like to discuss one of my personal implementation experiences for a large social networking site.
Cloud computing is really helpful, but it also needs a logical approach at the programming level.</p>
<p><img src="http://tarunjangra.com/images/assets/mysql-57-clustering-the-developer-perspective-60-638.jpg" alt="MySQL Round-Robin"></p>
<h3 id="approach-1-six-servers-architecture-on-amazon-cloud">Approach 1: Six servers architecture on amazon cloud.</h3>
<p>WOW! I had implemented 1 load balancer, 1 MySQL master DB, 1 MySQL slave DB and 3 application servers. Such an architecture<br>
can handle huge traffic, since there is a separate application server layer where we can add more application servers anytime
we need: user requests get balanced across the 3 application servers, which return the responses. But my application had one more
problem. When a user clicks a single link, it executes 100+ SQL queries, because of framework overhead plus some intentional queries.
Hmmmm, so the MySQL load is never balanced by this technique alone, and it has to be, because 1 request triggers 100+ SQLs.
So I drilled down to find a solution and decided to separate SQL reads and writes. With this, I got the opportunity
to send the MySQL writes separately, and I brought up one MySQL slave server for reads.</p>
<h3 id="does-this-really-get-me-at-the-end-of-performance-level">Does this really get me at the end of performance level?</h3>
<p>No, because we run read queries much more frequently than writes. So of those 100+ SQLs, only a few are database writes, and my write server
still has idle resources.</p>
<h3 id="here-is-where-round-robin-comes-in">Here is where Round Robin comes in.</h3>
<p>If I could develop logic that distributes my 100+ SQLs across any number of available replicated instances,
that could really work for me. Say I have 5 read servers for 100+ SQLs; then I can distribute around 20 SQLs per server per request.
And as we increase the number of read servers, the system adjusts itself to distribute (SQL queries) / (number of servers) (Qn / Sn).
This way, all of my servers work on every request made to the system, and I get maximum performance from them.
There is no point in having 1000 servers if 1 server answers 1 complete request, because then 999
servers are idle, which is a waste of money. So I implemented this in my PHP application, and that is what really makes it worthwhile to be
on the cloud: using the maximum of your resources.</p>
<h1>How to create custom amazon AMI through CLI Commands</h1>
<p>Posted 2011-05-11 at http://tarunjangra.com/2011/05/11/how-to-create-custome-amazon-AMI</p>
<p>Today I am going to explain how you can create a custom Amazon AMI, so you can launch an instance from it anytime later.
This gives you a clone of your server whenever you need it. I assume you are able to log in to your current running
instance and that you have your private key and certificate downloaded somewhere.</p>
<!--more-->
<p><img src="http://tarunjangra.com/images/assets/ami_lifecycle.png" alt="Aamazon AMI"></p>
<p>Upload your private key and certificate to the running instance.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">scp -i path/of/yourkeypair.pem path/of/cert.pem root@your-instance-host:/mnt
scp -i path/of/yourkeypair.pem path/of/pk.pem root@your-instance-host:/mnt</code></pre></figure>
<p>Log in to your instance and check that the uploaded files are available in /mnt. Then bundle the volume:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">ec2-bundle-vol \
-d /mnt -k /mnt/pk.pem \
-c /mnt/cert.pem \
-u <span class="m">673491274719</span> \
-p name-of-ami</code></pre></figure>
<p>This will take some time and create the desired ami to be uploaded in the bucket. So you can use that later anytime you need.
Now upload your bundle to amazon s3 storage.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">ec2-upload-bundle \
-b <S3-bucket-name> \
-m /mnt/name-of-ami.manifest.xml \
-a <AWS-access-key-id> \
-s <AWS-secret-access-key> \
--location EU</code></pre></figure>
<p>Note: Remember to upload to an S3 bucket in the correct region. Also, if the bucket does not exist, it will be created for you.
(I've used a European bucket as an example.)
Now we need to register the AMI. Do the following:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">ec2-register <S3-bucket-name>/name-of-ami.manifest.xml --region eu-west-1</code></pre></figure>
<p>It will return the new AMI ID (like ami-).
That's it; you are done with your custom AMI.</p>
<h1>Solr setup debian (lenny) + tomcat6 + solr</h1>
<p>Posted 2010-03-10 at http://tarunjangra.com/2010/03/10/Solr-setup-debian-(lenny)-tomcat6-solr</p>
<p>I am working on a task to set up Solr enterprise search for Elgg. The more I dig into this amazing
search utility, the more it surprises me. First I am going to explain how to install Solr with Tomcat 6.x.</p>
<h3 id="requirements">Requirements:</h3>
<ol>
<li>JDK, JRE (OpenJDK, SunJDK)</li>
<li>Tomcat6.x</li>
<li>Latest solr</li>
</ol>
<!--more-->
<h3 id="installation-jdk-jre">Installation JDK,JRE:</h3>
<p>Well, I set up OpenJDK (JDK and JRE) on my lenny server. It is quite easy using the Debian package manager. You can install
them using:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">apt-get install openjdk-6-jre openjdk-6-jdk</code></pre></figure>
<p>That gave me the full JDK and JRE environment. You may need to set the JAVA_HOME environment variable if you do not
install the JDK at the default location. You can do this in ".profile" under your home directory, or in "/etc/profile" to enable it for
all users.</p>
<h3 id="download-tomcate6-x">Download tomcat6.x</h3>
<p>I downloaded the tomcat binary "apache tomcat 6.0.24" and untarred it at "/usr/local/". You can choose any location you like.
So the location of all my tomcat binaries was "/usr/local/tomcat". That's it; you are done with the tomcat installation.
You can start tomcat as:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nb">cd</span> /usr/local/tomcat/bin/
./startup.sh</code></pre></figure>
<p>Now open localhost:8080 in your browser; you will see the tomcat server's response. The next step is to install solr
as a tomcat application, which needs some configuration.</p>
<h3 id="installation-amp-configuration-of-solr">Installation & configuration of Solr</h3>
<p>Download apache solr and unzip it at any accessible location. Now create some directories under tomcat as</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">mkdir /usr/local/tomcat/data/solr/elgg/conf -p
mkdir /usr/local/tomcat/data/solr/elgg/data -p</code></pre></figure>
<p>Now we need to copy the "apache-solr-1.4.0.war" file for tomcat deployment. Go to the directory where you unzipped solr;
I found the file at "apache-solr-1.4.0/dist/apache-solr-1.4.0.war".</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">cp apache-solr-1.4.0/dist/apache-solr-1.4.0.war /usr/local/tomcat/data/solr</code></pre></figure>
<p>Now, in /usr/local/tomcat/conf/Catalina/localhost we need to create and save a file which will be read the next time you
start Tomcat, and which properly deploys Solr. Using a text editor of your choice, create a file named "solrelgg.xml" in the
/usr/local/tomcat/conf/Catalina/localhost subdirectory, with the following contents:</p>
<figure class="highlight"><pre><code class="language-xml" data-lang="xml"><span class="nt"><Context</span> <span class="na">docBase=</span><span class="s">"/usr/local/tomcat/data/solr/apache-solr-1.4.0.war"</span>
<span class="na">debug=</span><span class="s">"0"</span> <span class="na">crossContext=</span><span class="s">"true"</span><span class="nt">></span>
<span class="nt"><Environment</span> <span class="na">name=</span><span class="s">"solr/home"</span> <span class="na">type=</span><span class="s">"java.lang.String"</span>
<span class="na">value=</span><span class="s">"/usr/local/tomcat/data/solr/elgg"</span> <span class="na">override=</span><span class="s">"true"</span> <span class="nt">/></span>
<span class="nt"></Context></span></code></pre></figure>
<p>Now go to "apache-solr-1.4.0/example/solr/conf" and copy all the default configuration files into our configuration
directory under tomcat.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nb">cd </span>apache-solr-1.4.0/example/solr/conf
cp * -R /usr/local/tomcat/data/solr/elgg/conf
<span class="nb">cd</span> /usr/local/tomcat/data/solr/elgg/conf</code></pre></figure>
<p>Now edit "solrconfig.xml" and find the "solr.data.dir" parameter. Change its value to the new data directory. I gave a relative path, "
../data", so it pointed to the new data directory "/usr/local/tomcat/data/solr/elgg/data". This edit is an optional
step; you can skip it, in which case the data directory will be created at the default location given by the default value of
"solr.data.dir".
Now start the tomcat server using "/usr/local/tomcat/bin/startup.sh" and browse to localhost:8080/solrelgg.</p>
<p>It should show you the "Welcome to Solr!" message with a "Solr Admin" link.
I hope it works for you too. Now the Elgg integration is just a matter of pushing new entities on the create-entity hook and all
the other CRUD operations.</p>