Re: [S] extraction from big datafile

Michael Bramley, M.Sc. (lifer@fuse.net)
Mon, 27 Apr 1998 18:52:10 -0400


Hello,

Is this a text file, a SAS file, ...? The source format will invariably
affect your choice of platform.

If it's SAS....
The sas.get() function will allow you to target specific variables and
subset the file, e.g.:
sas.get(lib, file, var=c("lat","long"),
        ifs=c("subsetting if1", "subsetting if2", ...))

If it's text, then read.table() might do the trick. You can specify which
fields to read in using col.names= (ahh, I don't believe I said that--127MB
would take days to process). The row.names= argument **might** be used to
subset the data, but I've never tried it.
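
One way to avoid reading all 127MB at once is to go through the file in
chunks and keep only the matching lines. What follows is only a rough
sketch in R-style syntax, not something tested on S+ 3.3, where the
connection functions may differ or be missing. The names
extract_locations, wanted, and chunk_size are made up for illustration,
the file is assumed to be whitespace-delimited with no header line, and
the demo values are invented.

```r
# Sketch: pull rows for a set of (lat, lon) pairs from a large
# whitespace-delimited text file, one chunk of lines at a time.
# extract_locations, wanted, and chunk_size are hypothetical names.
extract_locations <- function(path, wanted, col_names, chunk_size = 50000) {
  # One lookup key per wanted pair, e.g. "38.5_-76.9".
  # Assumption: coordinates in the file match the wanted values exactly.
  wanted_keys <- paste(wanted$latitude, wanted$longitude, sep = "_")

  con <- file(path, open = "r")
  on.exit(close(con))

  kept <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)   # next block of lines
    if (length(lines) == 0) break             # end of file
    chunk <- read.table(text = lines, col.names = col_names)
    keys <- paste(chunk$latitude, chunk$longitude, sep = "_")
    # Keep only the rows whose (lat, lon) key is in the wanted set.
    kept[[length(kept) + 1]] <- chunk[keys %in% wanted_keys, , drop = FALSE]
  }

  result <- do.call(rbind, kept)
  if (!is.null(result)) rownames(result) <- NULL
  result
}

# Tiny demonstration with a made-up file (values are illustrative only).
tmp <- tempfile()
writeLines(c("100 38.5 -76.9 250.1 260.2",
             "100 40.0 -75.0 251.3 261.4",
             "101 38.5 -76.9 249.8 259.9"), tmp)
wanted <- data.frame(latitude = 38.5, longitude = -76.9)
cols <- c("julian.day", "latitude", "longitude", "TB1", "TB2")
hits <- extract_locations(tmp, wanted, cols, chunk_size = 2)
```

Only one chunk is held in memory at a time, plus whatever rows have
matched so far, so memory use depends on chunk_size and the number of
hits rather than on the size of the file. Matching on pasted keys is
exact, so coordinates that differ in the last decimal place will not
match; rounding both sides first may be needed.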

With this much data, I would want to load it into a real production
platform first, then subset it down to just what you want to port to S+.

But that's my opinion,
Michael
----------
> From: srosenfeld@nesdis.noaa.gov
> To: s-news@cs.wisc.edu
> Subject: [S] extraction from big datafile
> Date: Monday, April 27, 1998 12:34 PM
>
> Dear S'ers:
>
> I have a 127MB file containing about 2,000,000 lines of the following
> type:
>
> julian.day latitude longitude TB1 TB2 TB3.......
>
> I need to extract and to store as an S+ object the lines related to
> specific list of locations (i.e. pairs lat&lon)
>
> Normally, I do this kind of work in FORTRAN. Is there any good way
> (comparable in speed) to perform this kind of extraction in S+? I work
> with S+ 3.3, pent/166/32MB RAM.
>
> Thank you
>
> Simon Rosenfeld
> NOAA Science Center,
> NESDIS/Satellite Research Lab
> Camp Springs, MD
> -----------------------------------------------------------------------
> This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
> message: unsubscribe s-news