Archives for the ‘Latest Articles’ category

How to run Norconex Collector in Docker

Docker is popular because it makes it easy to package and deliver programs. This article will show you how to run the Java-based, open-source crawler Norconex HTTP Collector, along with the Elasticsearch Committer, in Docker to crawl a website and index ... Read More...


Posted on February 10, 2018 in Latest Articles


Diagrams for Norconex Crawlers

Norconex just made it easier to understand the inner workings of its crawlers by creating clickable flow diagrams. Those diagrams are now available as part of both the Norconex HTTP Collector and Norconex Filesystem Collector websites. Clicking on a shape will ... Read More...


Posted on May 15, 2017 in Latest Articles


Indexing to an AWS CloudSearch Domain

Amazon Web Services (AWS) has been all the rage lately, used by many organizations, companies and even individuals. This rise in popularity can be attributed to the sheer number of services provided by AWS, such as Elastic Compute Cloud (EC2), Elastic ... Read More...


Posted on May 4, 2017 in Latest Articles


How Web Crawling Can Benefit Your Business

There are many business applications where web crawling can be of benefit. You or your team likely have ongoing research projects or smaller projects that come up from time to time. You may do a lot of manual web searching ... Read More...


Posted on September 19, 2016 in Latest Articles


Google Search Appliance is Being Phased Out… Now What?

Google Search Appliance (GSA) was introduced in 2002, and since then, thousands of organizations have acquired Google's “search in a box” to meet their search needs. Earlier this year, Google announced they are discontinuing sales of this appliance past 2016 ... Read More...


Posted on April 15, 2016 in Latest Articles


Use Solr 5 with Docker

Docker is all the rage at the moment! It was recently selected as a Gartner Cool Vendor in DevOps. As you may already know, Docker is a platform to build and deploy applications as self-contained units. Those units, called containers, can ... Read More...


Posted on May 1, 2015 in Latest Articles


Data Mining with Solr 5 – How to Slice and Dice Your Data With Facet Pivot and the Stats Module

You already know that Solr is a great search application, but did you know that Solr 5 could be used as a platform to slice and dice your data? With Facet Pivot working hand in hand with the Stats Module, ... Read More...


Posted on April 9, 2015 in Latest Articles


How to Run Solr as a Service on Windows

In this tutorial, I will show you how to run Solr as a Microsoft Windows service. Up to version 5.0.0, it was possible to run Solr inside the Java web application container of your choice. However, since the release of ... Read More...


Posted on March 25, 2015 in Latest Articles


What’s new in Solr 5

I am very excited about the new Solr 5. I had the opportunity to download and install the latest release, and I have to say that I am impressed with the work that has been done to make Solr easy ... Read More...


Posted on March 4, 2015 in Latest Articles


Create a website broken links checker

This tutorial will show you how to extend Norconex HTTP Collector using Java to create a link checker to ensure all URLs in your web pages are valid. The link checker will crawl your target site(s) and create a report ... Read More...


Posted on February 10, 2015 in Latest Articles