I inserted some debugging code in SearchManager and these are the results:
Total metadata records currently indexed: 41489
2009-10-27 14:24:39,392 INFO [geonetwork.index] - Indexing record (51212)
2009-10-27 14:24:39,407 INFO [geonetwork.index] - record schema (xxxxxx)
2009-10-27 14:24:39,407 INFO [geonetwork.index] - record createDate (2003-06-03T00:00:00)
2009-10-27 14:24:39,407 INFO [geonetwork.index] - Begin - Collect info
2009-10-27 14:24:40,907 INFO [geonetwork.index] - Begin - Lucene Indexing
2009-10-27 14:25:10,814 INFO [geonetwork.index] - Begin - Spatial Indexing
2009-10-27 14:25:10,845 INFO [geonetwork.index] - END
2009-10-27 14:29:16,001 INFO [geonetwork.index] - - record (51217)
2009-10-27 14:29:16,001 INFO [geonetwork.index] - Indexing record (51217)
2009-10-27 14:29:16,017 INFO [geonetwork.index] - record schema (xxxxx)
2009-10-27 14:29:16,017 INFO [geonetwork.index] - record createDate (2003-06-03T00:00:00)
2009-10-27 14:29:16,032 INFO [geonetwork.index] - Begin - Collect info
2009-10-27 14:29:17,657 INFO [geonetwork.index] - Begin - Lucene Indexing
2009-10-27 14:29:54,689 INFO [geonetwork.index] - Begin - Spatial Indexing
2009-10-27 14:29:54,735 INFO [geonetwork.index] - END
2009-10-27 14:30:32,048 INFO [geonetwork.index] - - record (51219)
2009-10-27 14:30:32,048 INFO [geonetwork.index] - Indexing record (51219)
2009-10-27 14:30:32,142 INFO [geonetwork.index] - record schema (xxxxx)
2009-10-27 14:30:32,142 INFO [geonetwork.index] - record createDate (2003-06-03T00:00:00)
2009-10-27 14:30:32,142 INFO [geonetwork.index] - Begin - Collect info
2009-10-27 14:30:33,970 INFO [geonetwork.index] - Begin - Lucene Indexing
2009-10-27 14:31:10,751 INFO [geonetwork.index] - Begin - Spatial Indexing
2009-10-27 14:31:10,782 INFO [geonetwork.index] - END
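In most cases, Lucene indexing takes more than 30 seconds per record (about 30 to 37 seconds in the three samples above), while collecting info takes under 2 seconds and spatial indexing takes well under 0.1 seconds.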
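The per-phase durations can be read off the log by subtracting consecutive timestamps. A minimal Java sketch of that arithmetic, assuming every line starts with a 23-character timestamp in the format shown above (the class name PhaseDurations and its methods are illustrative, not part of GeoNetwork):

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (not GeoNetwork code): measures the gaps between log lines.
public class PhaseDurations {
    // Timestamp layout used in the log excerpt, e.g. "2009-10-27 14:24:40,907".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Parses the leading 23-character timestamp of a log line. */
    static LocalDateTime timestamp(String line) {
        return LocalDateTime.parse(line.substring(0, 23), TS);
    }

    /** Returns the milliseconds elapsed between each pair of consecutive lines. */
    static List<Long> gapsMillis(List<String> lines) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            LocalDateTime previous = timestamp(lines.get(i - 1));
            LocalDateTime current = timestamp(lines.get(i));
            gaps.add(Duration.between(previous, current).toMillis());
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Phase markers for record 51212, copied from the log above.
        List<String> record51212 = List.of(
            "2009-10-27 14:24:39,407 INFO [geonetwork.index] - Begin - Collect info",
            "2009-10-27 14:24:40,907 INFO [geonetwork.index] - Begin - Lucene Indexing",
            "2009-10-27 14:25:10,814 INFO [geonetwork.index] - Begin - Spatial Indexing",
            "2009-10-27 14:25:10,845 INFO [geonetwork.index] - END");
        // Gaps are: collect info, Lucene indexing, spatial indexing.
        System.out.println(gapsMillis(record51212)); // prints [1500, 29907, 31]
    }
}
```

For record 51212 this gives 1.5 s for collecting info, 29.9 s for Lucene indexing and 0.03 s for spatial indexing.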
Lucene index file size: _1rtg.cfs -> 195 MB
Thanks for your help.
Juan Carlos Méndez
---------- Forwarded message ----------
From: James Wilson <[hidden email]>
To: [hidden email]
Date: Mon, 26 Oct 2009 02:06:24 -0700 (PDT)
Subject: Re: [GeoNetwork-devel] Lucene: Faster indexing
Is the problem within Lucene, or within the shapefile that GeoNetwork creates
using geotools? In my (limited) experience with shapefiles / geotools,
adding to a shapefile using a transaction gets progressively slower as the
number of features in the shapefile grows. I believe geotools copies the
file to a temporary location, then reindexes the whole file. This should scale
worse than linearly, but I'm guessing not much worse than N log N.
It might be worth trying to insert some logging info into SearchManager to
push out timings for the different stages of the indexing process.
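One way to collect such timings is a small helper that accumulates wall-clock time per named phase. The sketch below is only an illustration: PhaseTimer is a hypothetical class, not part of GeoNetwork or SearchManager, and the phase names and sleeps in main are stand-ins for the real indexing steps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not GeoNetwork code): accumulates elapsed time per named phase.
public class PhaseTimer {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();
    private String current;   // name of the phase being timed, or null if none
    private long startedAt;   // System.nanoTime() value when the current phase began

    /** Closes the running phase (if any) and starts timing a new one. */
    public void begin(String phase) {
        end();
        current = phase;
        startedAt = System.nanoTime();
    }

    /** Closes the running phase and adds its elapsed time to that phase's total. */
    public void end() {
        if (current != null) {
            nanosByPhase.merge(current, System.nanoTime() - startedAt, Long::sum);
            current = null;
        }
    }

    /** Returns the accumulated milliseconds per phase, in first-seen order. */
    public Map<String, Long> millis() {
        Map<String, Long> result = new LinkedHashMap<>();
        nanosByPhase.forEach((phase, nanos) -> result.put(phase, nanos / 1_000_000));
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        PhaseTimer timer = new PhaseTimer();
        timer.begin("collect info");
        Thread.sleep(20);              // stand-in for the real work
        timer.begin("lucene indexing");
        Thread.sleep(50);              // stand-in for the real work
        timer.begin("spatial indexing");
        Thread.sleep(5);               // stand-in for the real work
        timer.end();
        System.out.println(timer.millis());
    }
}
```

Because the totals accumulate across calls, reusing one timer over many records would show which phase dominates overall, not just for a single record.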
Interested to see how you get on with that volume of records.
James