re: [S] warning on the use of unpaste

Scott.Chasalow@USERS.PV.WAU.NL
Fri, 23 Oct 1998 11:35:29 +0200


On Thu Oct 22 19:09:10 1998,
"William Q. Meeker" <wqmeeker@iastate.edu> wrote:
>Now that unpaste() has been uncovered, users should be warned about
>
>unpaste(NULL)
>
>and
>
>unpaste(numeric(0))
>
>either of which will suddenly and unpleasantly kill your S-Plus
>session ...

Yes, thanks for the warning. On my S-Plus 3.3 for Windows, this sometimes
works and sometimes kills S-Plus. The problem is in a C function called
by unpaste(). I suspect a zero-length string causes an improper memory
allocation, so the C code ends up trying to access memory it shouldn't.
Whether that causes a visible problem (such as S-death) depends on what
that memory is being used for at the time.

This raises a more general point that is perhaps worth repeating:
undocumented S functions must always be used with rather more care than
the standard documented ones. They are likely to be less bullet-proofed,
which is reasonable when their authors intended them only for a very
specific purpose, for which they knew a lot about the possible input
(perhaps because the calling functions screen it).
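
As a rough sketch of the kind of screening a calling function might do,
here is a wrapper that rejects the troublesome inputs before unpaste()
ever sees them. The names screen.unpaste.input() and safe.unpaste() are
made up for illustration and are not part of S-Plus. Counting fields with
strsplit() assumes an R-style dialect; S-Plus 3.3 would need some other
way to count separators.

```r
# Hypothetical helper: stop on inputs that unpaste() is known to mishandle.
screen.unpaste.input <- function(x, sep)
{
    if(length(x) == 0)
        stop("x has length zero")              # e.g. NULL, numeric(0)
    if(!is.character(x))
        stop("x must be a character vector")
    if(any(is.na(x)))
        stop("x contains missing values")
    if(any(nchar(x) == 0))
        stop("x contains empty strings")       # e.g. ""
    # Assumption: strsplit() is available (it is in R) for counting fields.
    nfields <- sapply(strsplit(x, sep, fixed = TRUE), length)
    if(any(nfields != nfields[1]))
        stop("strings have different numbers of fields")
    invisible(TRUE)
}

# Hypothetical wrapper: unpaste() is only reached for screened input.
safe.unpaste <- function(x, sep)
{
    screen.unpaste.input(x, sep)
    unpaste(x, sep = sep)
}
```

With this in place, safe.unpaste(NULL, " ") should stop with an ordinary
S error message instead of reaching the C code at all.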

In this case, unpaste() is far from robust against the various sorts of
pathological input we evil users might wish to pass it. Witness:

> unpaste("")
Error in tabulate: Missing value where logical needed: if(nbins < max(bin))
stop
("nbins must be at least as large as max(bin)")
. . .
Dumped

> unpaste(c("Hi there", "bye now"), sep = " ") # This works fine
[[1]]:
[1] "Hi" "bye"

[[2]]:
[1] "there" "now"

> unpaste(c("Hi there", ""), sep = " ") # This is not so nice
[[1]]:
[1] "Hi" "NA"

[[2]]:
[1] "there" "NA"

# And god forbid the strings should have different numbers of fields!
> unpaste(c("Hi there !!", "bye now"), sep = " ")
Error in rep.int: rep() only defined for length(times)==1 or length(x):
rep.int(
1:xlen, times)
. . .
Dumped

Finally, Mark Bravington makes a good point that unpaste() can be slow
compared to a more direct use of substring() for specific applications.
This is not surprising. Although it lacks bullet-proofing, unpaste() does
a lot of computation in order to be rather more general than Mark's
example needs. His moral - don't do what I did and rewrite your code to avoid
"substring" in favour of "unpaste" - is well taken. But if unpaste() does what
you want for one-off jobs and interactive use, go right ahead.
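
To illustrate what a direct use of substring() can look like when the
field positions are known in advance, here is a small sketch. The data
and the column positions are invented for illustration; this is not
Mark's example.

```r
# Hypothetical fixed-width codes: 2 letters followed by 4 digits.
id <- c("AB1234", "CD5678")

prefix <- substring(id, 1, 2)              # characters 1-2 of each string
serial <- as.numeric(substring(id, 3, 6))  # characters 3-6, as numbers
```

Here prefix holds "AB" and "CD", and serial holds 1234 and 5678. No
separator needs to be located, which is why this route can be quicker
when the layout is fixed.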

Cheers,
Scott
=========================================
Scott.Chasalow@users.pv.wau.nl

Wageningen Agricultural University
Laboratory of Plant Breeding
P.O. Box 386
6700 AJ Wageningen
THE NETHERLANDS

http://www.spg.wau.nl/pv/staff/Chasal_S.htm
==========================================
