unexpected behavior of IO.read_line

Marco Trudel

unexpected behavior of IO.read_line

Dear all

IO.read_line overwrites the content of IO.last_string instead of
creating a new string object. This is hard to figure out as a regular
user. Take, for instance, this snippet:

     failing_string_example
         local
             s1, s2: STRING
         do
             IO.read_line
             s1 := IO.last_string
             IO.read_line
             s2 := IO.last_string
             IO.put_string (s1 + "%N")
             IO.put_string (s2 + "%N")
         end

After it runs, both s1 and s2 hold the last user input.
Is this behavior intended? What do you guys think about changing it?


Thanks and have a nice day
Marco
Peter Horan

RE: unexpected behavior of IO.read_line

The behaviour you report is intended. It has been like that for close to 20 years if not longer. Newcomers are frequently troubled as you are.

By rewriting `last_string' each time rather than creating a new string, the decision of creating a new object is left to the user.

For example, if you read a text file of 1000 lines, do you want to create 1000 strings or one string?

separate_string_example
      -- Use `twin' to create new strings
   local
      s1, s2: STRING
   do
      io.read_line
      s1 := io.last_string.twin
      io.read_line
      s2 := io.last_string.twin
      io.put_string (s1 + "%N")
      io.put_string (s2 + "%N")
   end

single_string_example
      -- Use `twin' to create one new string and
      -- append a subsequent string
      -- replacing newline by space
   local
      s1: STRING
   do
      io.read_line
      s1 := io.last_string.twin
      io.read_line
      s1.append_character(' ')
      s1.append(io.last_string)
      io.put_string (s1 + "%N")
   end
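
The point about leaving object creation to the caller can be illustrated with a routine that reads several lines but copies only the ones it needs to keep. This is a minimal sketch, not library code: the class name LONGEST_LINE_DEMO and the feature `longest_of' are hypothetical, and it assumes the buffer-reuse behaviour of `last_string' described in this thread.

```eiffel
class
    LONGEST_LINE_DEMO
        -- Hypothetical demo class, not part of EiffelBase.

create
    make

feature {NONE} -- Initialization

    make
            -- Print the longest of the next three lines of standard input.
        do
            io.put_string (longest_of (3) + "%N")
        end

feature -- Basic operations

    longest_of (n: INTEGER): STRING
            -- Longest of the next `n' lines read from standard input.
        local
            i: INTEGER
        do
            create Result.make_empty
            from
                i := 1
            until
                i > n
            loop
                io.read_line
                if io.last_string.count > Result.count then
                        -- Keep this line: copy it before the next read
                        -- overwrites the shared buffer.
                    Result := io.last_string.twin
                end
                i := i + 1
            end
        end

end
```

Here `twin' is called only when a line becomes the new maximum, so lines that are merely inspected never cause a new string to be created.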
--
Peter Horan             Faculty of Science and Technology
[hidden email]     Deakin University
+61-3-5221 1234 (Voice) Geelong, Victoria 3217, AUSTRALIA
+61-4-0831 2116 (Mobile)

-- The Eiffel guarantee: From specification to implementation
-- (http://www.cetus-links.org/oo_eiffel.html)

Bernd Schoeller-3

Re: unexpected behavior of IO.read_line

The idea of 'last_string' being a buffer is broken (it is just the wrong
word).

The (IMHO) right decision would be to rename 'last_string' to
'read_buffer' and then to have 'last_string' be the twin of 'read_buffer'.

I think we would not lose much in changing it (much less than the
changes that are currently running through the Eiffel world with Void
safety). Some code would become less efficient, but in 99% of the
cases it would not create problems. If it does, replacing 'last_string' by
'read_buffer' is easy and the code becomes more readable.

<irony>
On the other hand, 20 years of tradition seems to be a good reason to
keep broken code.
</irony>

Bernd


--
Bernd Schoeller, PhD, CTO, Partner
Comerge AG, Bubenbergstrasse 11, CH-8045 Zurich, www.comerge.net
rfo

RE: unexpected behavior of IO.read_line

In reply to this post by Marco Trudel
Hi Bernd

Actually, the irony comes from wanting to change the semantics of
last_string, but not the "traditional" name.  We should keep the name
and semantics as they are, but provide a twin-ing function as an
alternative (hardly a techno stretch).  No breaks, no whining, etc.  All
that needs changing then (versus creating anew) is the documentation
that says how to use it.  All old code (and old programmers) can remain
as they were.  The original goal of leaving (potentially wasteful)
object creation to the user's discretion remains, so those who adore
copy semantics can have their way just as much as those who worship
reference semantics.

last_string_read: STRING
  do
    Result := last_string.twin
  end

No big deal;  No endless committee meetings, no broken code, no angst.

     R

==================================================
Roger F. Osmond



panfriedwoggle

Re: unexpected behavior of IO.read_line

In reply to this post by Peter Horan
Is this not also an example of the principle of Command-Query Separation?



Peter Gummer-2

Re: unexpected behavior of IO.read_line

In reply to this post by rfo
Roger Osmond wrote:

> last_string_read: STRING
>  do
>    Result := last_string.twin
>  end
>
> No big deal;  No endless committee meetings, no broken code, no angst.


I don't think this fixes the problem. New users of Eiffel still won't  
easily know that last_string should not be used for the purpose that  
its name suggests.

Bernd's suggestion sounds sensible to me. Changing last_string to  
return a copy of the buffer won't break any existing code, other than  
making it run more slowly. If the slowness matters, people will notice  
it and switch to Bernd's suggested 'buffer' query.

There could be, nonetheless, strange cases out there that won't fit  
the rosy picture I just painted:

* If someone out there has ever written some code that relied on  
'last_string' referring to the same object across multiple reads, then  
that code would be broken by Bernd's suggestion. There's only one way  
I can imagine this: aliasing of 'last_string'. This doesn't strike me  
as likely, but maybe it's a good reason to reject the idea.

* If there's poorly tested code out there written by someone who  
didn't understand 'last_string', then it may be waiting to explode one  
day. Maybe it does explode on rare occasions and whoever uses it has  
never figured out why. This code would automatically get fixed by  
Bernd's suggestion.
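
A small probe makes the aliasing case concrete: it keeps a reference to 'last_string' across two reads and reports whether both reads delivered the same object. This is only a sketch; the class name SAME_OBJECT_PROBE is hypothetical, and which branch runs depends on the library version in use.

```eiffel
class
    SAME_OBJECT_PROBE
        -- Hypothetical demo class, not part of EiffelBase.

create
    make

feature {NONE} -- Initialization

    make
            -- Read two lines and report whether `last_string'
            -- denoted the same object after both reads.
        local
            first: STRING
        do
            io.read_line
            first := io.last_string
                -- Alias on purpose: no `twin' here.
            io.read_line
            if first = io.last_string then
                    -- `=' on references compares object identity.
                io.put_string ("same object: the buffer is reused%N")
            else
                io.put_string ("different objects: each read yields a new string%N")
            end
        end

end
```

Code that depends on the first branch is the kind that would change behaviour if 'last_string' started returning a copy.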

- Peter Gummer
Peter Horan

RE: Re: unexpected behavior of IO.read_line

In reply to this post by panfriedwoggle
Panfriedwoggle wrote:

> Is this not also an example of the principle of Command-Query Separation?

Yes. Of course. (Applying a principle automatically means one tends to forget
about it when explaining a design choice).

Command-Query Separation (CQS) is the policy of restricting instructions either
   1. to command a change of state of an object
      io.read_line changes the state of the object `io',
or
   2. to query the state of an object
      io.last_string reports the last string read, but does not alter `io'.

It may not be appropriate, or it may not be possible, to apply CQS in certain situations.
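
As a minimal sketch of the two kinds of feature, here is a hypothetical COUNTER class (not taken from any library): `increment' is a command that changes state and returns nothing, and `count' is a query that reports state without changing it.

```eiffel
class
    COUNTER
        -- Hypothetical class illustrating Command-Query Separation.

feature -- Access

    count: INTEGER
            -- Number of calls to `increment' so far.
            -- Query: reports state, has no side effect.

feature -- Element change

    increment
            -- Add one to `count'.
            -- Command: changes state, returns no result.
        do
            count := count + 1
        ensure
            incremented: count = old count + 1
        end

end
```

`io.read_line' and `io.last_string' follow the same split: the first plays the role of `increment', the second that of `count'.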
--
Peter Horan             Faculty of Science and Technology
[hidden email]     Deakin University
+61-3-5221 1234 (Voice) Geelong, Victoria 3217, AUSTRALIA
+61-4-0831 2116 (Mobile)

-- The Eiffel guarantee: From specification to implementation
-- (http://www.cetus-links.org/oo_eiffel.html)
Chris Saunders-4

RE: unexpected behavior of IO.read_line

In reply to this post by Peter Gummer-2
I think that the commenting style is largely to blame for this for fairly
new users of Eiffel.  I seem to remember having a similar problem when first
using Eiffel many years ago.  Suggested usage would be useful to me still.

Regards
Chris Saunders



Paul Cohen

Re: Re: unexpected behavior of IO.read_line

In reply to this post by Peter Horan
Hi,

On Wed, Nov 4, 2009 at 8:26 AM, Peter Horan <[hidden email]> wrote:
> Panfriedwoggle wrote:
> > Is this not also an example of the principle of Command-Query Separation?
>
> Yes. Of course. (Applying a principle automatically means one tends to forget
> about it when explaining a design choice).

The fact that a new string is not created every time is not based on
the principle of Command-Query Separation. It is an implementation
choice made in the implementation of STD_FILES and related classes
that affects the client. It was made at a time when memory use and
performance issues were given very high priority. I think the choice
is bad. It is counterintuitive since you have different behaviour for
the other "last_*" features which implement "copy semantics". Also,
clients will need to make copies of the read strings in most cases
anyway.
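
For contrast with the string case, here is a minimal sketch using `read_integer' and `last_integer'. INTEGER is an expanded type, so assignment copies the value and a later read cannot change what was stored earlier. The class name INTEGER_COPY_DEMO is hypothetical.

```eiffel
class
    INTEGER_COPY_DEMO
        -- Hypothetical demo class, not part of EiffelBase.

create
    make

feature {NONE} -- Initialization

    make
            -- Read two integers and print them in the order read.
        local
            i1, i2: INTEGER
        do
            io.read_integer
            i1 := io.last_integer
                -- Value copy: `i1' is unaffected by the next read.
            io.read_integer
            i2 := io.last_integer
            io.put_integer (i1)
            io.put_new_line
            io.put_integer (i2)
            io.put_new_line
        end

end
```

The code has the same shape as the string snippet that opened this thread, but here the two variables keep their distinct values without any call to `twin'.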

The intuitive notion of strings has always occupied a place somewhere
between copy and reference semantics! This has made many string
related design issues tricky to resolve. Personally I think that
strings when viewed as a sequence of characters should be handled with
copy semantics. When strings are viewed or used as a sequence of
octets (usually and improperly called ASCII characters) they should be
handled with reference semantics. I am aware that this raises the
issue of mutable/immutable strings so I should probably stop ranting
here! ;-)

/Paul

--
Paul Cohen
www.seibostudios.se
mobile: +46 730 787 035
e-mail: [hidden email]
Bernd Schoeller-3

Re: unexpected behavior of IO.read_line

In reply to this post by rfo
Hi Roger,

The name 'last_string' is bad. I do not think that a comment will
rectify this. The comment is not present when the feature is used.

The 'last_string' is not the last string. It is the same string. Naming
it 'last_string' implies that there are also "non-last" strings. But
there are not. 'last_string' is just a string buffer for the IO.

The following code makes it easy to explain the 'twin':

io.read_line
line1 := io.read_buffer.twin
io.read_line
line2 := io.read_buffer.twin

With 'last_string', this just looks awkward. Good naming should make it
possible to read code without having to go to the definition of its
features (not that I am saying that this is always possible, but here it
is).

Bernd


--
Bernd Schoeller, PhD, CTO, Partner
Comerge AG, Bubenbergstrasse 11, CH-8045 Zurich, www.comerge.net
rfo

RE: unexpected behavior of IO.read_line

In reply to this post by Marco Trudel
Hi Bernd!

I'm not defending 'last_string' certainly (last_*, new* are nothing but
trouble frankly).

I much prefer your notion of a buffer, but the name 'read_buffer' is
very bad because it is a verb followed by a noun - this indicates to
most western language speakers that this is an imperative.  This
confusion is further reinforced by all of the similarly named IO
features, like read_stream, read_integer, etc.

A name more clearly denoting a query is in order; perhaps
'buffer_content' or perhaps 'input_buffer_content' (though both input
and buffer can be considered verbs too), or maybe
content_of_input_buffer, just to send all of our keystroke-counting
friends over the edge.

What I don't want to see is existing code broken.  It's as simple as
that.  Eiffel users have been punished enough already.

If we want to deprecate last_string (and all the other last_* features),
I'm quite fine with that. For consistency, we would need to replace them
with queries of the form "*_as_integer" where '*' is whatever the actual
buffer contents gets named.  For example, content_of_buffer_as_integer.


     R

==================================================
Roger F. Osmond



Bernd Schoeller-3

Re: unexpected behavior of IO.read_line

Hi Roger,

Thanks for your answer. Some more comments from my side:

> I much prefer your notion of a buffer, but the name 'read_buffer' is
> very bad because it is a verb followed by a noun - this indicates to
> most western language speakers that this is an imperative. This
> confusion is further reinforced by all of the similarly named IO
> features, like read_stream, read_integer, etc.

You definitely have a point there. 'string_buffer' or 'input_buffer'
might be good.

> What I don't want to see is existing code broken. It's as simple as
> that. Eiffel users have been punished enough already.

Point taken about the "punish" part, but do you have an example of real-world
code that would break by renaming 'last_string' to 'input_buffer' and
then defining a new 'last_string' to return input_buffer.twin?
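
A sketch of how that renaming would look to a client, written as a hypothetical wrapper around `io' rather than as a change to STD_FILES itself. The class name LINE_INPUT is an assumption for illustration; 'input_buffer' is only the name proposed in this thread and does not exist in the library.

```eiffel
class
    LINE_INPUT
        -- Hypothetical wrapper illustrating the proposed naming.

feature -- Input

    read_line
            -- Read the next line from standard input.
        do
            io.read_line
        end

feature -- Access

    input_buffer: STRING
            -- Shared buffer, overwritten by each call to `read_line'.
            -- (Plays the role of the current `io.last_string'.)
        do
            Result := io.last_string
        end

    last_string: STRING
            -- Fresh copy of the last line read, safe to keep across reads.
        do
            Result := input_buffer.twin
        end

end
```

Clients that care about allocation would use `input_buffer'; everyone else gets a new string from `last_string' without having to know about `twin'.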

Bernd

--
Bernd Schoeller, PhD, CTO, Partner
Comerge AG, Bubenbergstrasse 11, CH-8045 Zurich, www.comerge.net
Berend de Boer

Re: Re: unexpected behavior of IO.read_line

In reply to this post by Paul Cohen
>>>>> "Paul" == Paul Cohen <[hidden email]> writes:

    Paul> The fact that a new string is not created every time is not
    Paul> based on the principle of Command-Query separation. It is an
    Paul> implementation choice made in the implementation of
    Paul> STD_FILES and related classes that affects the client. It
    Paul> was made at a time when memory use and performance issues
    Paul> were given very high priority. I think the choice is
    Paul> bad. It is counterintuitive since you have different
    Paul> behaviour for the other "last_*" features which implement
    Paul> "copy semantics". Also, clients will need to make copies of
    Paul> the read strings in most cases anyway.

You may have a point regarding naming (although everyone in Eiffel
uses this convention), but memory use is still an extremely important
criterion. You simply can't write anything that performs well without
paying attention to it.

I think people pick up on this one very quickly. It is much harder to
explain to programmers why their code is so slow, and slow code would
also give Eiffel a bad rap with everyone trying it for the first time.

It's just one of those things you will have to learn when programming Eiffel.

--
Cheers,

Berend de Boer