Posts tagged ‘Crawler’

How to run Norconex Collector in Docker

Introduction: Docker is popular because it makes it easy to package and deliver programs. This article shows you how to run the Java-based, open-source crawler Norconex HTTP Collector, together with the Elasticsearch Committer, in Docker to crawl a website and index ...


Posted on February 10, 2018 in Latest Articles


Norconex HTTP Collector 2.8.0 released

Norconex is proud to announce the release of Norconex HTTP Collector version 2.8.0. This release is accompanied by new releases of many related Norconex open-source products (Filesystem Collector, Importer, Committers, etc.), and together they bring dozens of new features and ...


Posted on November 26, 2017 in Latest Releases


An Open-Source Crawler That Feeds an SQL Database

Norconex released an SQL Committer for its open-source crawlers (Norconex Collectors). This enables you to store your crawled information in an SQL database of your choice. To define an SQL database as your crawler’s target repository, follow these steps: Download ...


Posted on May 29, 2017 in Latest Releases


An Open-Source Crawler for Microsoft Azure Search

Norconex just released a Microsoft Azure Search Committer for its open-source crawlers (Norconex Collectors). This empowers Azure Search users with full-featured file system and web crawlers. If you have not yet discovered Norconex Collectors, head over to the Norconex Collectors website ...


Posted on May 23, 2017 in Latest Releases


Indexing to an AWS CloudSearch Domain

Amazon Web Services (AWS) has been all the rage lately, used by many organizations, companies and even individuals. This rise in popularity can be attributed to the sheer number of services provided by AWS, such as Elastic Compute (EC2), Elastic ...


Posted on May 4, 2017 in Latest Articles


Norconex HTTP and Filesystem Collector 2.7.0 released

Norconex released version 2.7.0 of both its HTTP Collector and Filesystem Collector. This update, along with related component updates, introduces several interesting features. HTTP Collector changes: The following items are specific to the HTTP Collector. For changes applying to both the ...


Posted on April 26, 2017 in Latest Releases


How Web Crawling Can Benefit Your Business

Web crawling can benefit many business applications. You or your team likely have ongoing research projects or smaller projects that come up from time to time. You may do a lot of manual web searching ...


Posted on September 19, 2016 in Latest Articles


Norconex HTTP Collector 2.6.0 released

Norconex has released version 2.6.0 of its HTTP Collector web crawler! Among the new features, an upgrade of its Importer module brings new document parsing and manipulation capabilities. Some of the changes highlighted here also benefit the Norconex Filesystem Collector. New ...


Posted on August 25, 2016 in Latest Releases


Norconex HTTP Collector 2.5.0 released

Norconex has released Norconex HTTP Collector version 2.5.0! This new version of our open-source web crawler helps you minimize re-crawling frequencies and download delays, and it lets you specify a locale for date parsing and formatting. The ...


Posted on June 3, 2016 in Latest Releases


An Open-Source Crawler for Amazon CloudSearch

Norconex just released an Amazon CloudSearch Committer module for its open-source crawlers (Norconex “Collectors”). This is an especially useful contribution to CloudSearch users given that CloudSearch does not have its own crawlers. If you’re not yet familiar with Norconex Collectors, ...


Posted on April 28, 2016 in Latest Releases