Loading a point-vector table with 466 columns

Nikos Alexandris

Loading a point-vector table with 466 columns

Hi R-statisticians :-)

I am trying to load a BIG table (=466 columns, 875 rows) of a grass
point-vector map in R. It takes too long to respond (as expected of
course)...

* Is there any "limitation" in spgrass6 regarding the column numbers?
* Can I somehow know that the data are fed into R and I am not wasting
my time?
* Should I instead export the table as CSV and work in an R session
outside GRASS?

Kind regards, Nikos

_______________________________________________
grass-stats mailing list
[hidden email]
http://lists.osgeo.org/mailman/listinfo/grass-stats
Nikos Alexandris

Re: Loading a point-vector table with 466 columns


Nikos:
> Hi R-statisticians :-)
>
> I am trying to load a BIG table (=466 columns, 875 rows) of a grass
> point-vector map in R. It takes too long to respond (as expected of
> course)...
>
> * Is there any "limitation" in spgrass6 regarding the column numbers?
> * Can I somehow know that the data are fed into R and I am not wasting
> my time?

###

> * Should I prefer to export as a csv the table and work in an
> out-of-GRASS R session?

This works perfectly. So why does loading the GRASS attribute table take
so long (if it works at all)?

Thanks, Nikos

Dylan Beaudette-2

Re: Loading a point-vector table with 466 columns

Could be a DBF-related problem.

Cheers,
Dylan

On Wed, May 20, 2009 at 6:00 PM, Nikos Alexandris
<[hidden email]> wrote:

>
> Nikos:
>> Hi R-statisticians :-)
>>
>> I am trying to load a BIG table (=466 columns, 875 rows) of a grass
>> point-vector map in R. It takes too long to respond (as expected of
>> course)...
>>
>> * Is there any "limitation" in spgrass6 regarding the column numbers?
>> * Can I somehow know that the data are fed into R and I am not wasting
>> my time?
>
> ###
>
>> * Should I prefer to export as a csv the table and work in an
>> out-of-GRASS R session?
>
> This works perfectly. So, why loading the grass attribute table takes so
> long (if it actually works)?
>
> Thanks, Nikos
>
Nikos Alexandris

Re: Loading a point-vector table with 466 columns


Dylan:
> Could be a DBF-related problem.

Dylan,

I am using sqlite as DB backend. How does this affect the import via
readVECT6?

Nikos

Dylan Beaudette-2

Re: Loading a point-vector table with 466 columns

I use that back-end as well. I am not sure if readVECT6() still uses
shapefiles as an intermediate file...

Dylan

On Fri, May 22, 2009 at 4:02 AM, Nikos Alexandris
<[hidden email]> wrote:

>
> Dylan:
>> Could be a DBF-related problem.
>
> Dylan,
>
> I am using sqlite as DB backend. How does this affect the import via
> readVECT6?
>
> Nikos
>
>
Roger Bivand

Re: Loading a point-vector table with 466 columns

On Fri, 22 May 2009, Dylan Beaudette wrote:

> I use that back-end as well. I am not sure if readVECT6() still uses
> shapefiles as an intermediate file...

Yes, it does. If, however, you have a working GRASS/OGR plugin, it may be
that readOGR() (inside readVECT6()) will use it and transfer the data
directly. To diagnose, one would need to look at readVECT6() and readOGR()
running step-by-step to see whether the problem is v.out.ogr or readOGR(),
or possibly some locale-dependent wrinkle in the DBF.

Nikos - if you'd like to contribute well-proven patches, you are most
welcome; until then, I'm not motivated by your complaints to do anything
at all. Speed is a secondary issue - primary is just having things work,
and handling the myriad variants of GRASS on different platforms is hard
enough. OK?

Roger

>
> Dylan
>
> On Fri, May 22, 2009 at 4:02 AM, Nikos Alexandris
> <[hidden email]> wrote:
>>
>> Dylan:
>>> Could be a DBF-related problem.
>>
>> Dylan,
>>
>> I am using sqlite as DB backend. How does this affect the import via
>> readVECT6?
>>
>> Nikos
>>
>>

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [hidden email]

Nikos Alexandris

Re: Loading a point-vector table with 466 columns


Dylan:
> > I use that back-end as well. I am not sure if readVECT6() still uses
> > shapefiles as an intermediate file...

Roger:
> Yes, it does. If, however, you have a working GRASS/OGR plugin, it may be
> that readOGR() (inside readVECT6()) will use it and transfer the data
> directly. To diagnose, one would need to look at readVECT6() and readOGR()
> running step-by-step to see whether the problem is v.out.ogr or readOGR(),
> or possibly some locale-dependent wrinkle in the DBF.

> Nikos - if you'd like to contribute well-proven patches, you are most
> welcome, until then, I'm not motivated by your complaints to do anything
> at all. Speed is a secondary issue - primary is just having things work,
> and handling the myriad variants of GRASS on different platforms is hard
> enough. OK?

I absolutely agree, Roger. Of course, I am no programmer and I can only
"complain" whenever I find (or think that I find) something not
working. Please, if my complaints lie within your interests, I kindly
ask for "pointers" on how I could _at least_ provide better (more
convincing?) "complaints" ;-)

Thanks for the feedback in any case. Nikos

hamish-2

Re: Loading a point-vector table with 466 columns

In reply to this post by Nikos Alexandris

> Roger:
> Speed is a secondary issue - primary is just having things work,
> and handling the myriad variants of GRASS on different platforms
> is hard enough.

Very true, and shapefile+DBF is probably (after the native vector format)
the second-best tested. But at the same time shapefile/DBF is a somewhat
lossy format, so it is no panacea.


regards,
Hamish



     

Nikos Alexandris

Re: Loading a point-vector table with 466 columns

In reply to this post by Nikos Alexandris

Nikos:
> > I am trying to load a BIG table (=466 columns, 875 rows) of a grass
> > point-vector map in R. It takes too long to respond (as expected of
> > course)...

OK, I let it run and it worked. BUT I forgot to time it (I read
somewhere that it's pretty easy to do so in R).

I only know that it took more than 30 minutes which, compared to the few
seconds it takes to read the .csv file, is far from "working OK". I can
run it again once I find that time-measuring function.


Cheers, Nikos
---

# the command and the response (just for the records)
# running R from within a GRASS location

# loading "spgrass6"
library(spgrass6) ; G <- gmeta6()

# read the attribute table
sample_2 <- readVECT6("sample_2_grid_points")

OGR data source with driver: GRASS
Source:
"/geo/grassdb/peloponnese/evaluation_utm/nik/vector/sample_2_grid_points/head", layer: "1"
with  875  rows and  466  columns
Feature type: wkbPoint with 3 dimensions

Nikos Alexandris

Re: Loading a point-vector table with 466 columns

Just FYI

# almost an hour...
Sys.time() ; sample_2 <- readVECT6("sample_2_grid_points") ; Sys.time()
[1] "2009-05-22 23:25:02 CEST"
OGR data source with driver: GRASS
Source:
"/geo/grassdb/peloponnese/evaluation_utm/nik/vector/sample_2_grid_points/head", layer: "1"
with  875  rows and  466  columns

Feature type: wkbPoint with 3 dimensions
[1] "2009-05-23 00:22:12 CEST"


# while reading the csv...
Sys.time() ; sample_2 <- read.csv(file="sample_2_grid_points_table.csv") ; Sys.time()
[1] "2009-05-23 01:39:51 CEST"
[1] "2009-05-23 01:39:52 CEST"


Kindest regards, Nikos

Roger Bivand

Re: Loading a point-vector table with 466 columns

On Sat, 23 May 2009, Nikos Alexandris wrote:

> Just FYI
>
> # almost an hour...
> Sys.time() ; sample_2 <- readVECT6("sample_2_grid_points") ; Sys.time()
> [1] "2009-05-22 23:25:02 CEST"
> OGR data source with driver: GRASS
> Source:
> "/geo/grassdb/peloponnese/evaluation_utm/nik/vector/sample_2_grid_points/head", layer: "1"
> with  875  rows and  466  columns
>
> Feature type: wkbPoint with 3 dimensions
> [1] "2009-05-23 00:22:12 CEST"

Thanks, this is helpful. As you can see, the work is being done by the OGR
GRASS driver, not by writing a shapefile with v.out.ogr and reading that
shapefile with the shapefile driver in OGR. Since these are just points,
the simplest geometry, something is making the driver run slowly.

Does plugin=FALSE speed it up or slow it down (that would force the use of
a temporary shapefile)?

Does anyone know of interactions between SQLite storage and the OGR GRASS
driver?

By the way, just put the command inside system.time() to time it.

>
>
> # while reading the csv...
>> Sys.time() ; sample_2 <-
> read.csv(file="sample_2_grid_points_table.csv") ; Sys.time()
> [1] "2009-05-23 01:39:51 CEST"
> [1] "2009-05-23 01:39:52 CEST"
>

This is not a fair comparison, because you have to dump the CSV file from
the GRASS database first, although it won't take long. What are you using
to do that? Have you considered connecting to the SQLite file directly
from R? Are the (2) coordinates present in the table? See:

http://cran.r-project.org/web/packages/RSQLite/index.html

for direct reading.

Roger
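
Roger's suggestion of reading the attribute table straight from the SQLite file can be sketched as follows. This is a minimal illustration using Python's standard sqlite3 module rather than RSQLite, so it shows the idea and not the R call; the database path and table name in the usage comment are hypothetical and depend on how the GRASS mapset is configured.

```python
import sqlite3


def read_attribute_table(db_path, table):
    """Return (column_names, rows) for one table of an SQLite database."""
    con = sqlite3.connect(db_path)
    try:
        # A table name cannot be bound as a parameter, so quote it as an identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cur = con.execute(f"SELECT * FROM {quoted}")
        columns = [desc[0] for desc in cur.description]
        rows = cur.fetchall()
    finally:
        con.close()
    return columns, rows


# Hypothetical usage; the real path depends on the mapset's database settings:
# columns, rows = read_attribute_table("/path/to/mapset/sqlite.db", "sample_2_grid_points")
```

Reading this way returns attributes only; the point coordinates would still have to come from GRASS unless they are stored as ordinary columns.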

>
> Kindest regards, Nikos
>

Nikos Alexandris

Re: Loading a point-vector table with 466 columns


Nikos:
> > # almost an hour...
> > Sys.time() ; sample_2 <- readVECT6("sample_2_grid_points") ; Sys.time()
> > [1] "2009-05-22 23:25:02 CEST"
> > OGR data source with driver: GRASS
> > Source:
> > "/geo/grassdb/peloponnese/evaluation_utm/nik/vector/sample_2_grid_points/head", layer: "1"
> > with  875  rows and  466  columns

> > Feature type: wkbPoint with 3 dimensions
> > [1] "2009-05-23 00:22:12 CEST"

Roger:
--%<---
> Does plugin=FALSE speed it up or slow it down (that would force the use of
> a temporary shapefile)?

Yes, it speeds up.

# with "plugin=FALSE"
system.time(readVECT6("sample_2_grid_points", plugin=FALSE))
Exporting 875 points/lines...
 100%
875 features written
OGR data source with driver: ESRI Shapefile
Source: "/geo/grassdb/peloponnese/evaluation_utm/nik/.tmp/vertical",
layer: "sample_2"
with  875  rows and  466  columns
Feature type: wkbPoint with 2 dimensions
   user  system elapsed
169.450  24.677 204.882


## there is one difference: wkbPoint with "3" vs "2" dimensions ##
## what does this mean (wkbPoint)? OK, I will look it up in the book ##


> > # while reading the csv...
> > Sys.time() ; sample_2 <-
> > read.csv(file="sample_2_grid_points_table.csv") ; Sys.time()
> > [1] "2009-05-23 01:39:51 CEST"
> > [1] "2009-05-23 01:39:52 CEST"

--%<---
> This is not a fair comparison, because you have to dump the CSV file from
> the GRASS database first, although it won't take long. What are you using
> to do that?

# right, it takes some time (<1min)
# running from within GRASS location
time db.out.ogr in=sample_2_grid_points \
  dsn=/geo/grassdb/peloponnese/R/R_files/sample_2_grid_points_table \
  format=CSV
Exported table </geo/grassdb/peloponnese/R/R_files/sample_2_grid_points_table.csv>

real 0m46.845s
user 0m22.065s
sys 0m23.637s

> Have you considered connecting to the SQLite file directly
> from R? Are the (2) coordinates present in the table? See:
>
> http://cran.r-project.org/web/packages/RSQLite/index.html
>
> for direct reading.

I was not aware of RSQLite. If it's straightforward I'll try it today.
If you mean the x, y coordinates just as normal columns, no, I don't
require them currently.


Overview of loading the GRASS attribute table (875 rows, 466 columns) via:

* readVECT6() with plugin=TRUE                         : ~57min
* readVECT6() with plugin=FALSE                        : ~3min+
* export from grass as CSV (~46sec) + read.csv (1 sec) : ~47sec

Nikos
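
For reference, the three timings reported in this thread can be put on a common scale. The sketch below only redoes the arithmetic from the figures quoted above (the two Sys.time() stamps, the system.time() elapsed value, and the shell `time` output plus roughly one second for read.csv); the variable and function names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start, end):
    """Seconds between two timestamps written as 'YYYY-MM-DD HH:MM:SS'."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# readVECT6() through the OGR GRASS plugin (Sys.time() before and after)
plugin_true = elapsed_seconds("2009-05-22 23:25:02", "2009-05-23 00:22:12")

# readVECT6(plugin=FALSE): "elapsed" value reported by system.time()
plugin_false = 204.882

# db.out.ogr export ("real" time) plus about one second for read.csv
csv_route = 46.845 + 1.0

for label, seconds in [("plugin=TRUE", plugin_true),
                       ("plugin=FALSE", plugin_false),
                       ("CSV export + read.csv", csv_route)]:
    ratio = plugin_true / seconds
    print(f"{label:24s} {seconds:9.1f} s  (plugin=TRUE is {ratio:.1f}x this)")
```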

Roger Bivand

Re: Loading a point-vector table with 466 columns

On Sat, 23 May 2009, Nikos Alexandris wrote:

>
> Nikos:
>>> # almost an hour...
>>> Sys.time() ; sample_2 <- readVECT6("sample_2_grid_points") ; Sys.time()
>>> [1] "2009-05-22 23:25:02 CEST"
>>> OGR data source with driver: GRASS
>>> Source:
>>> "/geo/grassdb/peloponnese/evaluation_utm/nik/vector/sample_2_grid_points/head", layer: "1"
>>> with  875  rows and  466  columns
>
>>> Feature type: wkbPoint with 3 dimensions
>>> [1] "2009-05-23 00:22:12 CEST"
>
> Roger:
> --%<---
>> Does plugin=FALSE speed it up or slow it down (that would force the use of
>> a temporary shapefile)?
>
> Yes, it speeds up.
>
> # with "plugin=FALSE"
> system.time(readVECT6("sample_2_grid_points", plugin=FALSE))
> Exporting 875 points/lines...
> 100%
> 875 features written
> OGR data source with driver: ESRI Shapefile
> Source: "/geo/grassdb/peloponnese/evaluation_utm/nik/.tmp/vertical",
> layer: "sample_2"
> with  875  rows and  466  columns
> Feature type: wkbPoint with 2 dimensions
>   user  system elapsed
> 169.450  24.677 204.882
>
>
> ## there is one difference: wkbPoint with "3" vs "2" dimensions ##
> ## what does this mean (wkbPoint)? OK, I look for it in the book ##
>

Three minutes instead of thirty+ suggests that the OGR plugin has trouble
with SQLite as the DB format. So maybe the default for plugin= should be
FALSE, rather than NULL with automatic use of the plugin if present?

The plugin also creates a fictitious third dimension in (point at least)
data that has created havoc, and has led to readVECT6() getting a
pointDropZ= argument - that's why it says that wkbPoint is 3 with the
plugin and (correctly) 2 otherwise.
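
On the wkbPoint question: the "2 dimensions" versus "3 dimensions" in the readOGR() output corresponds to whether the geometry type carries a z coordinate. As a rough illustration (this is not the driver's code), the sketch below encodes a point in well-known binary and checks the 2.5D flag that OGR sets on the geometry type code (wkbPoint25D = 0x80000001). The helper names are made up for this example.

```python
import struct

WKB_POINT = 1              # wkbPoint
WKB_25D_FLAG = 0x80000000  # high bit OGR sets for 2.5D types (wkbPoint25D = 0x80000001)


def encode_point(x, y, z=None):
    """Encode a little-endian WKB point, with a z value if one is given."""
    if z is None:
        return struct.pack("<BIdd", 1, WKB_POINT, x, y)
    return struct.pack("<BIddd", 1, WKB_POINT | WKB_25D_FLAG, x, y, z)


def dimensions(wkb):
    """Return 2 or 3 depending on the 2.5D flag in the geometry type."""
    order = "<" if wkb[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(order + "I", wkb, 1)
    return 3 if geom_type & WKB_25D_FLAG else 2


def drop_z(wkb):
    """Re-encode a point as 2D, discarding any z value."""
    order = "<" if wkb[0] == 1 else ">"
    x, y = struct.unpack_from(order + "dd", wkb, 5)
    return encode_point(x, y)
```

A point carrying a fictitious z of 0.0 reports three dimensions here, and drop_z() turns it back into the plain 2D encoding without touching x and y.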

>
>>> # while reading the csv...
>>> Sys.time() ; sample_2 <-
>>> read.csv(file="sample_2_grid_points_table.csv") ; Sys.time()
>>> [1] "2009-05-23 01:39:51 CEST"
>>> [1] "2009-05-23 01:39:52 CEST"
>
> --%<---
>> This is not a fair comparison, because you have to dump the CSV file from
>> the GRASS database first, although it won't take long. What are you using
>> to do that?
>
> # right, it takes some time (<1min)
> # running from within GRASS location
> time db.out.ogr in=sample_2_grid_points
> dsn=/geo/grassdb/peloponnese/R/R_files/sample_2_grid_points_table
> format=CSV
> Exported table
> </geo/grassdb/peloponnese/R/R_files/sample_2_grid_points_table.csv>
>
> real 0m46.845s
> user 0m22.065s
> sys 0m23.637s

OK, thanks, this mirrors part of the v.out.ogr timing in the three
minutes.

Roger


>
>> Have you considered connecting to the SQLite file directly
>> from R? Are the (2) coordinates present in the table? See:
>>
>> http://cran.r-project.org/web/packages/RSQLite/index.html
>>
>> for direct reading.
>
> I was not aware of RSQLite. If it's straight-forward I'll try it today.
> If you mean the x, y coordinates just as normal columns, no, I don't
> require them currently.
>
>
> Overview of loading the GRASS attribute table (875 rows, 466 columns) via:
>
> * readVECT6() with plugin=TRUE                         : ~57min
> * readVECT6() with plugin=FALSE                        : ~3min+
> * export from grass as CSV (~46sec) + read.csv (1 sec) : ~47sec
>
> Nikos
>
>

hamish-2

Re: Loading a point-vector table with 466 columns

In reply to this post by Nikos Alexandris

Roger wrote:
> Three minutes instead of thirty+ suggests that the OGR
> plugin has trouble with SQLite as the DB format. So maybe
> the default for plugin= should be FALSE, not NULL and automatic
> use if present?

Better: if the cause of the slowdown is isolated (or at least
reproducible), please file a bug report to get the plugin fixed.
That way e.g. QGIS and others who use the plugin also get the
speedup.

> The plugin also creates a fictitious third dimension in
> (point at least) data that has created havoc, and has led
> to readVECT6() getting a pointDropZ= argument - that's why it
> says that wkbPoint is 3 with the plugin and (correctly) 2
> otherwise.

ditto.


Hamish



     

Roger Bivand

Re: Loading a point-vector table with 466 columns

On Sat, 23 May 2009, Hamish wrote:

>
> Roger wrote:
>> Three minutes instead of thirty+ suggests that the OGR
>> plugin has trouble with SQLite as the DB format. So maybe
>> the default for plugin= should be FALSE, not NULL and automatic
>> use if present?
>
> better: if cause for slowdown is isolated (or at least
> reproducible) please file a bug report to get the plugin fixed.
> That way e.g. qgis and others who use the plugin also get the
> speedup.

I'm not running SQLite, nor do I have a "wide" table. Could you, Nikos,
make a script generating a similar table in spearfish, and two small
scripts exercising the problem (export to R with the plugin, and with the
temporary shapefile)?

>
>> The plugin also creates a fictitious third dimension in
>> (point at least) data that has created havoc, and has led
>> to readVECT6() getting a pointDropZ= argument - that's why it
>> says that wkbPoint is 3 with the plugin and (correctly) 2
>> otherwise.
>
> ditto.

OK, it looks like calls to Vect_is_3d(poMap) are missing in
*OGRGRASSLayer::GetFeatureGeometry, in the lines after about 851 in
ogrsf_frmts/grass/ogrgrasslayer.cpp; the problem exists for all geometry
types, which just emit a z value even if the vector is 2D. I can't see that
Vect_read_line returns z as NULL if not 3D - Vect_is_3d() is not used much
in lib/vector/*. I cannot get into the osgeo trac - could you, Hamish,
help and enter a bug in the GDAL/GRASS or whichever trac stream is
relevant?

Roger

>
>
> Hamish
>
>
>
>
>
>

Nikos Alexandris

Re: Loading a point-vector table with 466 columns


--%<--
Roger wrote:
> I'm not running SQLite, nor do I have a "wide" table. Could you, Nikos,
> make a script generating a similar table in spearfish, and two small
> scripts exercising the problem (export to R with the plugin, and with the
> temporary shapefile)?

I'll try my best.

Kind regards, Nikos


hamish-2

Re: Loading a point-vector table with 466 columns

In reply to this post by Nikos Alexandris

Roger wrote:
> >> The plugin also creates a fictitious third dimension in
> >> (point at least) data that has created havoc, and has led
> >> to readVECT6() getting a pointDropZ= argument - that's why it
> >> says that wkbPoint is 3 with the plugin and (correctly) 2
> >> otherwise.
...
> OK, looks like calls to Vect_is_3d(poMap) missing in
> *OGRGRASSLayer::GetFeatureGeometry about lines after 851 in
> ogrsf_frmts/grass/ogrgrasslayer.cpp; the problem exists for
> all geometry types, just emitting a z value even if the vect is
> 2D. I can't see that Vect_read_line returns z as NULL if not 3D -
> Vect_is_3d() is not used much in lib/vector/*.

filed as gdalbug #3009:  https://trac.osgeo.org/gdal/ticket/3009
Roger I added you in cc there but suggest to change it to your osgeoid
before the spam scanners find your email addr.


> I cannot get into the osgeo trac -

yeah, it has been having troubles lately, just give it 5 minutes and
try again.  see http://thread.gmane.org/gmane.comp.gis.grass.devel/33404


Hamish



     

Markus Neteler

Re: Loading a point-vector table with 466 columns

On Mon, May 25, 2009 at 8:08 AM, Hamish <[hidden email]> wrote:
> Roger wrote:
...
> filed as gdalbug #3009:  https://trac.osgeo.org/gdal/ticket/3009
> Roger I added you in cc there but suggest to change it to your osgeoid
> before the spam scanners find your email addr.

This can be used to look up osgeo-IDs:
 http://www.osgeo.org/cgi-bin/ldap_web_search.py

 I have changed it already for Roger. Also added grass-dev in CC.

Markus
Markus Neteler

Re: Loading a point-vector table with 466 columns

In reply to this post by hamish-2
On Mon, May 25, 2009 at 8:08 AM, Hamish <[hidden email]> wrote:

>
> Roger wrote:
>> >> The plugin also creates a fictitious third dimension in
>> >> (point at least) data that has created havoc, and has led
>> >> to readVECT6() getting a pointDropZ= argument - that's why it
>> >> says that wkbPoint is 3 with the plugin and (correctly) 2
>> >> otherwise.
> ...
>> OK, looks like calls to Vect_is_3d(poMap) missing in
>> *OGRGRASSLayer::GetFeatureGeometry about lines after 851 in
>> ogrsf_frmts/grass/ogrgrasslayer.cpp; the problem exists for
>> all geometry types, just emitting a z value even if the vect is
>> 2D. I can't see that Vect_read_line returns z as NULL if not 3D -
>> Vect_is_3d() is not used much in lib/vector/*.
>
> filed as gdalbug #3009:  https://trac.osgeo.org/gdal/ticket/3009

According to the GDAL trac, this is now fixed:

On Mon, May 25, 2009 at 10:50 PM, GDAL <[hidden email]> wrote:

> #3009: GRASS OGR driver: 2D maps read as 3D
> --------------------+-------------------------------------------------------
>  Reporter:  hamish  |        Owner:  rouault
>     Type:  defect  |       Status:  closed
>  Priority:  normal  |    Milestone:  1.7.0
> Component:  OGR_SF  |      Version:  unspecified
>  Severity:  normal  |   Resolution:  fixed
>  Keywords:  grass   |
> --------------------+-------------------------------------------------------
> Changes (by rouault):
>
>  * status:  assigned => closed
>  * resolution:  => fixed
>  * milestone:  => 1.7.0
>
> Comment:
>
>  For the 2D geometries reported as 3D, I've committed a fix in trunk in
>  r17122.
>
> --
> Ticket URL: <http://trac.osgeo.org/gdal/ticket/3009#comment:4>
> GDAL <http://trac.osgeo.org/gdal/>
> Geospatial Data Abstraction Library is a translator library for raster and vector geospatial data formats.


Kudos to Even Rouault (and the bug hunters here of course),
Markus
Roger Bivand

Re: Loading a point-vector table with 466 columns

In reply to this post by Nikos Alexandris
On Mon, 25 May 2009, Even Rouault wrote:

> (Sorry, I'm not subscribed to the GRASS-stats list, so I've emailed directly
> the people who took part in the discussion)

Hi Even,

Thanks for tackling the 2D/3D issue in the GRASS read driver - the fixes
look appropriate.

The SQLite issue is, I believe, on the GRASS side. GRASS can use a number
of databases for storing attribute data, but geometries are stored
natively. This means that GRASS is using its own SQLite handling. I think
this is shown in OGRGRASSLayer::SetAttributes() (in the file you've seen in
the 2D/3D fix) for determining their types, and in
OGRGRASSLayer::GetFeature() for copying them from GRASS structures to OGR
structures. I think that the OGR driver is limited by GRASS's own driver,
but since I don't use SQLite, I cannot reproduce the problem.

I'll add grass-stats to the CC for completeness.

Roger

>
> About the OGR sqlite performance, I've not read all the posts of the thread,
> but I couldn't reproduce any performance related problem.
>
> Here's a simple OGR python script that creates a 1000 row x 1000 column sqlite
> db :
>
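> #!/usr/bin/python
> # Newer GDAL releases ship the Python bindings in the osgeo package
> from osgeo import ogr
>
> # Create an SQLite data source with one layer and 1000 text columns
> ds = ogr.GetDriverByName('SQLite').CreateDataSource('testhuge.db')
> lyr = ds.CreateLayer('test')
> for i in range(1000):
>     ds.ExecuteSQL("ALTER TABLE test ADD COLUMN att%d VARCHAR" % i)
> ds = None  # close the data source before reopening it
>
> # Reopen in update mode and write 1000 features, filling every column
> ds = ogr.Open('testhuge.db', update = 1)
> lyr = ds.GetLayerByName('test')
> for i in range(1000):
>     feat = ogr.Feature(lyr.GetLayerDefn())
>     for j in range(1000):
>         feat.SetField(j, 'val%d' % j)
>     lyr.CreateFeature(feat)
>
> ds = None  # close and flush to disk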
>
> (takes a few seconds to run)
>
>
> Then "time ogrinfo -ro -al -q testhuge.db  > /dev/null"
>
> --> real 0m3.903s
>
> So 4 seconds, not 30 minutes... Maybe it's due to how GRASS uses the
> OGR output?
>
> Actually, from reviewing the code, there might *maybe* be a performance
> problem if the FID or geometry columns of the table were at the end of the
> column list (see lines 308-313 and 339-344 in
> ogr/ogrsf_frmts/sqlite/ogrsqlitelayer.cpp, latest GDAL SVN head trunk),
> which could cause a loop over each column name before finding them. But I
> couldn't really evaluate the impact it could have. (By default OGR creates
> these fields as the first ones.)
>
> Even
>
>
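
For anyone without the OGR Python bindings, a similar wide table can be generated and read back with Python's standard sqlite3 module alone. This is only a sketch of a reproduction aid, sized like the table in this thread (875 rows, 466 attribute columns). The table and column names are invented, and it exercises plain SQLite rather than the GRASS or OGR drivers, so it can only show whether SQLite itself is slow with wide tables.

```python
import sqlite3
import time


def build_wide_table(con, n_rows=875, n_cols=466):
    """Create table 'test' with an integer key plus n_cols text columns."""
    col_defs = ", ".join(f"att{i} TEXT" for i in range(n_cols))
    con.execute(f"CREATE TABLE test (cat INTEGER PRIMARY KEY, {col_defs})")
    placeholders = ", ".join(["?"] * (n_cols + 1))
    rows = (
        [cat] + [f"val{j}" for j in range(n_cols)]
        for cat in range(1, n_rows + 1)
    )
    con.executemany(f"INSERT INTO test VALUES ({placeholders})", rows)
    con.commit()


def timed_read(con):
    """Fetch every row of 'test' and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = con.execute("SELECT * FROM test").fetchall()
    return rows, time.perf_counter() - start


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")  # use a file path to test on-disk behaviour
    build_wide_table(con)
    rows, seconds = timed_read(con)
    print(f"read {len(rows)} rows x {len(rows[0])} columns in {seconds:.3f} s")
    con.close()
```

If a plain read like this is fast while the plugin route stays slow on the same data, that would point away from SQLite itself and towards the driver layers discussed above.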
