Forum: War Ensemble BBS

A technique from a chatbot

From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.python on Tue Apr 2 17:18:16 2024

From Newsgroup: comp.lang.python

Some people can't believe it when I say that chatbots improve
my programming productivity. So, here's a technique I learned
from a chatbot!

It is a structured "break". "Break" still is a kind of jump,
you know?

So, what's a function to return the first word beginning with
an "e" in a given list, like for example

[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]

? Well it's

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word

. "return" still can be considered a kind of "goto" statement.
It can lead to errors:

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word
    something_to_be_done_at_the_end_of_this_function()

The call sometimes will not be executed here!
So, "return" is similar to "break" in that regard.

But in Python we can write:

def first_word_beginning_with_e( list_ ):
    return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

. No jumps anymore, yet the loop is aborted on the first hit
(if I guess correctly how it's working).

And it is this combination of "next", a generator, and "None" that
the chatbot showed me when I asked him how to get the first component
of a list that matches a condition!
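
A minimal sketch of what the default argument buys you, assuming the names `first_match` and `prefix` (they are made up for illustration and do not appear in the post). It also uses `startswith` rather than `word[ 0 ]`, so an empty string in the list does not raise an IndexError:

```python
def first_match(words, prefix):
    """Return the first word starting with prefix, or None if there is none."""
    # The generator expression is lazy: next() pulls items only until the first match.
    return next((word for word in words if word.startswith(prefix)), None)

greek = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
print(first_match(greek, 'e'))   # epsilon
print(first_match(greek, 'x'))   # None

# Without the default, an exhausted generator raises StopIteration instead.
try:
    next(word for word in greek if word.startswith('x'))
except StopIteration:
    print('no match, and no default given')
```

So the second argument to next() is what turns "nothing found" into an ordinary return value instead of an exception.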

PS: Let's verify the earliness of the exit out of the loop:

Main.py

def list_():
    list__ =[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]
    for entry in list__:
        print( f'Now yielding {entry}.' )
        yield entry

def first_word_beginning_with_e( list_ ):
    return next( ( word for word in list_() if word[ 0 ]== 'e' ), None )

print( first_word_beginning_with_e( list_ ))

sys.stdout

Now yielding delta.
Now yielding epsilon.
epsilon
--- Synchronet 3.20a-Linux NewsLink 1.114

From piergiorgio.sartor.this.should.not.be.used@piergiorgio.sartor.this.should.not.be.used@nexgo.REMOVETHIS.de to comp.lang.python on Tue Apr 2 19:47:59 2024

From Newsgroup: comp.lang.python

On 02/04/2024 19.18, Stefan Ram wrote:

Some people can't believe it when I say that chatbots improve
my programming productivity. So, here's a technique I learned
from a chatbot!

It is a structured "break". "Break" still is a kind of jump,
you know?

So, what's a function to return the first word beginning with
an "e" in a given list, like for example

[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]

? Well it's

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word

. "return" still can be considered a kind of "goto" statement.
It can lead to errors:

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word
    something_to_be_done_at_the_end_of_this_function()

The call sometimes will not be executed here!
So, "return" is similar to "break" in that regard.

But in Python we can write:

def first_word_beginning_with_e( list_ ):
    return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

Doesn't look a smart advice.

. No jumps anymore, yet the loop is aborted on the first hit

First of all, I fail to understand why there
should be no jumps any more.
It depends on how "return" and "if" are handled,
I guess, in different context.
Maybe they're just "masked".
In any case, the "compiler" should have just
done the same.
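
One way to check the point about "masked" jumps is to look at the bytecode. This is only a sketch, assuming CPython and its standard `dis` module; the helper name `opnames` is invented for illustration, and exact opcode names differ between Python versions, so no output is shown:

```python
import dis
import types

def with_return(list_):
    for word in list_:
        if word[0] == 'e':
            return word

def with_next(list_):
    return next((word for word in list_ if word[0] == 'e'), None)

def opnames(code):
    """Collect opcode names from a code object and any code objects nested in it."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)  # the generator expression is compiled separately
    return names

for func in (with_return, with_next):
    jumps = sorted(n for n in opnames(func.__code__) if 'JUMP' in n or n == 'FOR_ITER')
    print(func.__name__, jumps)
```

Both variants should list loop and jump instructions; in the next() version they sit inside the code object of the generator expression rather than in the function body itself.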

(if I guess correctly how its working).

Second, it is difficult to read, which is bad.
The "guess" above is just evidence of that.

My personal opinion about these "chatbots", is
that, while they might deliver clever solutions,
they are not explaining *why* these solutions
should be considered "clever".
Which is the most important thing (the solution
itself is _not_).

bye,
--

piergiorgio

--- Synchronet 3.20a-Linux NewsLink 1.114

From Thomas Passin@list1@tompassin.net to comp.lang.python on Tue Apr 2 15:31:26 2024

From Newsgroup: comp.lang.python

On 4/2/2024 1:47 PM, Piergiorgio Sartor via Python-list wrote:

On 02/04/2024 19.18, Stefan Ram wrote:

   Some people can't believe it when I say that chatbots improve
   my programming productivity. So, here's a technique I learned
   from a chatbot!
   It is a structured "break". "Break" still is a kind of jump,
   you know?
   So, what's a function to return the first word beginning with
   an "e" in a given list, like for example
[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]

   ? Well it's
def first_word_beginning_with_e( list_ ):
     for word in list_:
         if word[ 0 ]== 'e': return word

   . "return" still can be considered a kind of "goto" statement.
   It can lead to errors:

def first_word_beginning_with_e( list_ ):
     for word in list_:
         if word[ 0 ]== 'e': return word
     something_to_be_done_at_the_end_of_this_function()
   The call sometimes will not be executed here!
   So, "return" is similar to "break" in that regard.
   But in Python we can write:
def first_word_beginning_with_e( list_ ):
     return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

Doesn't look a smart advice.

   . No jumps anymore, yet the loop is aborted on the first hit

It's worse than "not a smart advice". This code constructs an
unnecessary tuple, then picks out its first element and returns that.
The something_to_be_done() function may or may not be called. And it's
harder to read and understand than necessary. Compare, for example,
with this version:

def first_word_beginning_with_e(target, wordlist):
    result = ''
    for w in wordlist:
        if w.startswith(target):
            result = w
            break
    do_something_else()
    return result

If do_something_else() is supposed to fire only if the target is not
found, then this slight modification will do:

def first_word_beginning_with_e(target, wordlist):
    result = ''
    for w in wordlist:
        if w.startswith(target):
            result = w
            break
    else:
        do_something_else()
    return result

[Using the "target" argument instead of "target[0]" will let you match
an initial string instead of just the first character].
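
A runnable sketch of the for/else behaviour described here. The function name `first_word_starting_with` and the `on_miss` callback are hypothetical stand-ins (the callback plays the role of do_something_else()), chosen so the effect can be observed:

```python
def first_word_starting_with(target, wordlist, on_miss):
    """Return the first word starting with target, or '' if none is found."""
    result = ''
    for w in wordlist:
        if w.startswith(target):
            result = w
            break
    else:
        # Runs only when the loop finished without hitting break.
        on_miss()
    return result

calls = []
words = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
print(first_word_starting_with('e', words, lambda: calls.append('miss')))        # epsilon
print(calls)                                                                     # []
print(repr(first_word_starting_with('x', words, lambda: calls.append('miss'))))  # ''
print(calls)                                                                     # ['miss']
```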

First of all, I fail to understand why there
should be no jumps any more.
It depends on how "return" and "if" are handled,
I guess, in different context.
Maybe they're just "masked".
In any case, the "compiler" should have just
done the same.

   (if I guess correctly how its working).

Second, it is difficult to read, which is bad.
The "guess" above is just evidence of that.

My personal opinion about these "chatbots", is
that, while they might deliver clever solutions,
they are not explaining *why* these solutions
should be considered "clever".
Which is the most important thing (the solution
itself is _not_).

bye,

--- Synchronet 3.20a-Linux NewsLink 1.114

From avi.e.gross@avi.e.gross@gmail.com to comp.lang.python on Wed Apr 3 01:27:00 2024

From Newsgroup: comp.lang.python

I am a tad confused by a suggestion that any kind of GOTO variant is bad. The suggestion runs counter to the reality that underneath it all, compiled programs are chock full of GOTO variants even for simple things like IF-ELSE.
Consider the code here:

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word
    something_to_be_done_at_the_end_of_this_function()

If instead the function initialized a variable to nothing useful and in the loop if it found a word beginning with e and it still contained nothing useful, copied it into the variable and then allowed the code to complete the loop and finally returned the variable, that would simply be a much less efficient solution to the problem and gain NOTHING. There are many variants you can come up with and when the conditions are complex and many points of immediate return, fine, then it may be dangerous. But a single return is fine.
The function does have a flaw as it is not clear what it should do if nothing is found. Calling a silly long name does not necessarily return anything.
Others, like Thomas, have shown other variants including some longer and more complex ways.
A fairly simple one-liner version, not necessarily efficient, would be to just use a list comprehension that makes a new list of just the ones matching the pattern of starting with an 'e' and then returns the first entry or None. This shows the code and tests it:

text = ["eastern", "Western", "easter"]
NorEaster = ["North", "West", "orient"]

def first_word_beginning_with_e( list_ ):
    return(result[0] if (result := [word for word in list_ if word[0].lower() == 'e']) else None)

print(first_word_beginning_with_e( text ))
print(first_word_beginning_with_e( NorEaster ))

Result of running it on a version of Python at least 3.8, so that it supports the walrus operator:

eastern
None
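
To make the "not necessarily efficient" remark concrete, here is a small sketch that counts how many words each approach actually examines. The `counting` helper and the `*_seen` lists are invented for this illustration:

```python
def counting(words, seen):
    """Yield words one by one, recording each word that is actually examined."""
    for word in words:
        seen.append(word)
        yield word

words = ['delta', 'epsilon', 'zeta', 'eta', 'theta']

# List comprehension: builds the whole list of matches before taking the first.
eager_seen = []
eager = [w for w in counting(words, eager_seen) if w[0].lower() == 'e']
first_eager = eager[0] if eager else None

# next() on a generator expression: stops at the first match.
lazy_seen = []
first_lazy = next((w for w in counting(words, lazy_seen) if w[0].lower() == 'e'), None)

print(first_eager, len(eager_seen))  # epsilon 5
print(first_lazy, len(lazy_seen))    # epsilon 2
```

For a five-word list the difference is irrelevant; it only starts to matter for long or expensive-to-produce inputs.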
--- Synchronet 3.20a-Linux NewsLink 1.114

From Thomas Passin@list1@tompassin.net to comp.lang.python on Wed Apr 3 07:50:55 2024

From Newsgroup: comp.lang.python

On 4/3/2024 1:27 AM, AVI GROSS via Python-list wrote:

I am a tad confused by a suggestion that any kind of GOTO variant is bad. The suggestion runs counter to the reality that underneath it all, compiled programs are chock full of GOTO variants even for simple things like IF-ELSE.

Consider the code here:

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word
    something_to_be_done_at_the_end_of_this_function()

If instead the function initialized a variable to nothing useful and in the loop if it found a word beginning with e and it still contained nothing useful, copied it into the variable and then allowed the code to complete the loop and finally returned the variable, that would simply be a much less efficient solution to the problem and gain NOTHING. There are many variants you can come up with and when the conditions are complex and many points of immediate return, fine, then it may be dangerous. But a single return is fine.

The function does have a flaw as it is not clear what it should do if nothing is found. Calling a silly long name does not necessarily return anything.

Others, like Thomas, have shown other variants including some longer and more complex ways.

A fairly simple one-liner version, not necessarily efficient, would be to just use a list comprehension that makes a new list of just the ones matching the pattern of starting with an 'e' and then returns the first entry or None. This shows the code and test it:

text = ["eastern", "Western", "easter"]

NorEaster = ["North", "West", "orient"]

def first_word_beginning_with_e( list_ ):
    return(result[0] if (result := [word for word in list_ if word[0].lower() == 'e']) else None)

print(first_word_beginning_with_e( text ))
print(first_word_beginning_with_e( NorEaster ))

Result of running it on a version of Python at least 3.8, so that it supports the walrus operator:

eastern
None

The OP seems to want to return None if a match is not found. If a
Python function ends without a return statement, it automatically
returns None. So nothing special needs to be done. True, that is
probably a special case, but it suggests that the problem posed to the
chatbot was not posed well. A truly useful chatbot could have discussed
many of the points we've been discussing. That would have made for a
good learning experience. Instead the chatbot produced poorly
constructed code that caused a bad learning experience.
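
A short sketch of the implicit-None behaviour mentioned above, using the plain loop version from the start of the thread (argument renamed to `words` for readability):

```python
def first_word_beginning_with_e(words):
    for word in words:
        if word[0] == 'e':
            return word
    # No explicit return here: falling off the end of a function returns None.

print(first_word_beginning_with_e(['delta', 'epsilon']))  # epsilon
print(first_word_beginning_with_e(['alpha', 'beta']))     # None
```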

[snip...]

--- Synchronet 3.20a-Linux NewsLink 1.114

From avi.e.gross@avi.e.gross@gmail.com to comp.lang.python on Wed Apr 3 11:32:44 2024

From Newsgroup: comp.lang.python

Sadly, Thomas, this is not even all that new.

I have seen people do searches on the internet for how to do one thing at a time and then cobble together some code that does something but perhaps not quite what they intended. Some things are just inefficient such as reading
data from a file, doing some calculations, writing the results to another
file, reading them back in and doing more calculations and writing them out again and so on. Yes, there can be value in storing intermediate results but why read it in again when it is already in memory? And, in some cases, why
not do multiple steps instead of one at a time and so on.

How many people ask how to TEST the code they get, especially from an
AI-like ...?


--- Synchronet 3.20a-Linux NewsLink 1.114

From Gilmeh Serda@gilmeh.serda@nothing.here.invalid to comp.lang.python on Wed Apr 3 18:45:20 2024

From Newsgroup: comp.lang.python

On 2 Apr 2024 17:18:16 GMT, Stefan Ram wrote:

first_word_beginning_with_e

Here's another one:

>>> def ret_first_eword():
...     return [w for w in ['delta', 'epsilon', 'zeta', 'eta', 'theta'] if w.startswith('e')][0]
...
>>> ret_first_eword()
'epsilon'
--
Gilmeh

Linux is addictive, I'm hooked! -- MaDsen Wikholm's .sig
--- Synchronet 3.20a-Linux NewsLink 1.114

From Pieter van Oostrum@pieter-l@vanoostrum.org to comp.lang.python on Wed Apr 3 23:15:18 2024

From Newsgroup: comp.lang.python

ram@zedat.fu-berlin.de (Stefan Ram) writes:

It can lead to errors:

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word
    something_to_be_done_at_the_end_of_this_function()

The call sometimes will not be executed here!
So, "return" is similar to "break" in that regard.

That can be solved with finally:

def first_word_beginning_with_e( list_ ):
    try:
        for word in list_:
            if word[ 0 ]== 'e': return word
    finally:
        print("something_to_be_done_at_the_end_of_this_function()")
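
A sketch that makes the effect of finally observable. The extra `log` parameter is an assumption added only so the end-of-function work can be checked; it stands in for the real call:

```python
def first_word_beginning_with_e(list_, log):
    try:
        for word in list_:
            if word[0] == 'e':
                return word   # early exit...
    finally:
        # ...but this still runs, whether or not a match was found.
        log.append('end-of-function work done')

log = []
print(first_word_beginning_with_e(['delta', 'epsilon', 'zeta'], log))  # epsilon
print(first_word_beginning_with_e(['alpha', 'beta'], log))             # None
print(len(log))                                                         # 2
```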
--
Pieter van Oostrum <pieter@vanoostrum.org>
www: http://pieter.vanoostrum.org/
PGP key: [8DAE142BE17999C4]
--- Synchronet 3.20a-Linux NewsLink 1.114

From Michael F. Stemper@michael.stemper@gmail.com to comp.lang.python on Wed Apr 3 16:36:30 2024

From Newsgroup: comp.lang.python

On 03/04/2024 13.45, Gilmeh Serda wrote:

On 2 Apr 2024 17:18:16 GMT, Stefan Ram wrote:

first_word_beginning_with_e

Here's another one:

>>> def ret_first_eword():
...     return [w for w in ['delta', 'epsilon', 'zeta', 'eta', 'theta'] if w.startswith('e')][0]
...
>>> ret_first_eword()
'epsilon'

Doesn't work in the case where there isn't a word starting with 'e':

>>> def find_e( l ):
...     return [w for w in l if w.startswith('e')][0]
...
>>> l = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
>>> find_e(l)
'epsilon'
>>> l = ['The','fan-jet','airline']
>>> find_e(l)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in find_e
IndexError: list index out of range
>>>
--
Michael F. Stemper
If it isn't running programs and it isn't fusing atoms, it's just bending space.

--- Synchronet 3.20a-Linux NewsLink 1.114

From Gilmeh Serda@gilmeh.serda@nothing.here.invalid to comp.lang.python on Thu Apr 4 17:35:33 2024

From Newsgroup: comp.lang.python

On Wed, 3 Apr 2024 16:36:30 -0500, Michael F. Stemper wrote:

On 03/04/2024 13.45, Gilmeh Serda wrote:

On 2 Apr 2024 17:18:16 GMT, Stefan Ram wrote:

first_word_beginning_with_e

Here's another one:

>>> def ret_first_eword():
...     return [w for w in ['delta', 'epsilon', 'zeta', 'eta', 'theta'] if w.startswith('e')][0]
...
>>> ret_first_eword()
'epsilon'

Doesn't work in the case where there isn't a word starting with 'e':

>>> def find_e( l ):
...     return [w for w in l if w.startswith('e')][0]
...
>>> l = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
>>> find_e(l)
'epsilon'
>>> l = ['The','fan-jet','airline']
>>> find_e(l)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in find_e
IndexError: list index out of range
>>>

Wow! It wasn't production code. And still isn't. (o.Ô)

>>> def find_e(l):
...     try:
...         return [w for w in l if w.startswith('e')][0]
...     except IndexError:
...         return None # or 0 or '' or whatever you want
...
>>> find_e(l)

--
Gilmeh

Drop in any mailbox.
--- Synchronet 3.20a-Linux NewsLink 1.114

From Mark Bourne@nntp.mbourne@spamgourmet.com to comp.lang.python on Thu Apr 4 20:03:45 2024

From Newsgroup: comp.lang.python

Thomas Passin wrote:

On 4/2/2024 1:47 PM, Piergiorgio Sartor via Python-list wrote:

On 02/04/2024 19.18, Stefan Ram wrote:

   Some people can't believe it when I say that chatbots improve
   my programming productivity. So, here's a technique I learned
   from a chatbot!
   It is a structured "break". "Break" still is a kind of jump,
   you know?
   So, what's a function to return the first word beginning with
   an "e" in a given list, like for example
[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]

   ? Well it's
def first_word_beginning_with_e( list_ ):
     for word in list_:
         if word[ 0 ]== 'e': return word

   . "return" still can be considered a kind of "goto" statement.
   It can lead to errors:

def first_word_beginning_with_e( list_ ):
     for word in list_:
         if word[ 0 ]== 'e': return word
     something_to_be_done_at_the_end_of_this_function()
   The call sometimes will not be executed here!
   So, "return" is similar to "break" in that regard.
   But in Python we can write:
def first_word_beginning_with_e( list_ ):
     return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

Doesn't look a smart advice.

   . No jumps anymore, yet the loop is aborted on the first hit

It's worse than "not a smart advice". This code constructs an
unnecessary tuple, then picks out its first element and returns that.

I don't think there's a tuple being created. If you mean:
( word for word in list_ if word[ 0 ]== 'e' )

...that's not creating a tuple. It's a generator expression, which
generates the next value each time it's called for. If you only ever
ask for the first item, it only generates that one.

When I first came across them, I did find it a bit odd that generator expressions look like the tuple equivalent of list/dictionary
comprehensions.

FWIW, if you actually wanted a tuple from that expression, you'd need to
pass the generator to tuple's constructor:
tuple(word for word in list_ if word[0] == 'e')
(You don't need to include an extra set of brackets when passing a
generator as the only argument to a function).
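
A quick interactive-style sketch of the distinction, using the word list from the start of the thread:

```python
words = ['delta', 'epsilon', 'zeta', 'eta', 'theta']

# Parentheses around a comprehension-like expression give a generator, not a tuple.
gen = (word for word in words if word[0] == 'e')
print(type(gen).__name__)   # generator
print(next(gen))            # epsilon
print(next(gen))            # eta

# A tuple has to be requested explicitly; no extra brackets are needed here.
as_tuple = tuple(word for word in words if word[0] == 'e')
print(as_tuple)             # ('epsilon', 'eta')
```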
--
Mark.
--- Synchronet 3.20a-Linux NewsLink 1.114

From avi.e.gross@avi.e.gross@gmail.com to comp.lang.python on Thu Apr 4 16:33:57 2024

From Newsgroup: comp.lang.python

That is an excellent point, Mark. Some of the proposed variants to the requested problem, including mine, do indeed find all instances only to return the first. This can use additional time and space but when done, some of the overhead is also gone. What I mean is that a generator you create and invoke once generally sits around indefinitely in your session unless it leaves your current scope or something. It does only a part of the work and must remain suspended and ready to be called again to do more.

If you create a generator inside a function and the function returns, presumably it can be garbage-collected.

But if it is in the main body, I have to wonder what happens.

There seem to be several related scenarios to consider.

- You may want to find, in our example, a first instance. Right afterwards, you want the generator to disassemble anything in use.
- You may want the generator to stick around and later be able to return the next instance. The generator can only really go away when another call has been made after the last available instance and it cannot look for more beyond some end.
- Finally, you can call a generator with the goal of getting all instances such as by asking it to populate a list. In such a case, you may not necessarily want or need to use a generator expression and can use something straightforward and possibly cheaper.

What confuses the issue, for me, is that you can make fairly complex calculations in Python using various forms of generators that implement a sort of just-in-time approach as generators call other generators which call yet others and so on. Imagine having folders full of files that each contain a data structure such as a dictionary or set and writing functionality that searches for the first match for a key in any of the dictionaries (or sets or whatever) along the way? Now imagine that dictionary items can be a key value pair that can include the value being a deeper dictionary, perhaps down multiple levels.

You could get one generator that generates folder names or opens them and another that generates file names and reads in the data structure such as a dictionary and yet another that searches each dictionary and also any internally embedded dictionaries by calling another instance of the same generator as much as needed.

You can see how this creates and often consumes generators along the way as needed and in a sense does the minimum amount of work needed to find a first instance. But what might it leave open and taking up resources if not finished in a way that dismantles it?

Perhaps worse, imagine doing the search in parallel and as soon as it is found anywhere, ...
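
On the question of what a half-consumed generator leaves behind: generator objects have a close() method that finishes them explicitly, and a try/finally inside the generator runs at that point. A minimal sketch, assuming the made-up names `matching_lines` and `log`:

```python
def matching_lines(lines, prefix, log):
    """Yield lines starting with prefix; record when the generator is torn down."""
    try:
        for line in lines:
            if line.startswith(prefix):
                yield line
    finally:
        # Runs when the generator is exhausted or closed while suspended.
        log.append('generator finalized')

log = []
gen = matching_lines(['delta', 'epsilon', 'eta'], 'e', log)
first = next(gen)
print(first, log)   # epsilon []
gen.close()         # raises GeneratorExit inside the suspended generator
print(log)          # ['generator finalized']
```

As far as I know, CPython also finalizes a suspended generator this way once nothing refers to it any more, but calling close() (or using contextlib.closing) makes the moment explicit when real resources such as open files are involved.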
--- Synchronet 3.20a-Linux NewsLink 1.114

From Thomas Passin@list1@tompassin.net to comp.lang.python on Thu Apr 4 17:10:34 2024

From Newsgroup: comp.lang.python

On 4/4/2024 3:03 PM, Mark Bourne via Python-list wrote:

Thomas Passin wrote:

On 4/2/2024 1:47 PM, Piergiorgio Sartor via Python-list wrote:

On 02/04/2024 19.18, Stefan Ram wrote:

   Some people can't believe it when I say that chatbots improve
   my programming productivity. So, here's a technique I learned
   from a chatbot!
   It is a structured "break". "Break" still is a kind of jump,
   you know?
   So, what's a function to return the first word beginning with
   an "e" in a given list, like for example
[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]

   ? Well it's
def first_word_beginning_with_e( list_ ):
     for word in list_:
         if word[ 0 ]== 'e': return word

   . "return" still can be considered a kind of "goto" statement.
   It can lead to errors:

def first_word_beginning_with_e( list_ ):
     for word in list_:
         if word[ 0 ]== 'e': return word
     something_to_be_done_at_the_end_of_this_function()
   The call sometimes will not be executed here!
   So, "return" is similar to "break" in that regard.
   But in Python we can write:
def first_word_beginning_with_e( list_ ):
     return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

Doesn't look a smart advice.

   . No jumps anymore, yet the loop is aborted on the first hit

It's worse than "not a smart advice". This code constructs an
unnecessary tuple, then picks out its first element and returns that.

I don't think there's a tuple being created. If you mean:
    ( word for word in list_ if word[ 0 ]== 'e' )

...that's not creating a tuple. It's a generator expression, which generates the next value each time it's called for. If you only ever
ask for the first item, it only generates that one.

Yes, I was careless when I wrote that. Still, the tuple machinery has to
be created and that's not necessary here. My point was that you are
asking the Python machinery to do extra work for no benefit in
performance or readability.

When I first came across them, I did find it a bit odd that generator expressions look like the tuple equivalent of list/dictionary comprehensions.

FWIW, if you actually wanted a tuple from that expression, you'd need to pass the generator to tuple's constructor:
    tuple(word for word in list_ if word[0] == 'e')
(You don't need to include an extra set of brackets when passing a
generator as the only argument to a function).

--- Synchronet 3.20a-Linux NewsLink 1.114

From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.python on Fri Apr 5 18:29:22 2024

From Newsgroup: comp.lang.python

Mark Bourne <nntp.mbourne@spamgourmet.com> wrote or quoted:

I don't think there's a tuple being created. If you mean:
( word for word in list_ if word[ 0 ]== 'e' )
...that's not creating a tuple. It's a generator expression, which generates the next value each time it's called for. If you only ever
ask for the first item, it only generates that one.

Yes, that's also how I understand it!

In the meantime, I wrote code for a microbenchmark, shown below.

This code, when executed on my computer, shows that the
next+generator approach is a bit faster when compared with
the procedural break approach. But when the order of the two
approaches is being swapped in the loop, then it is shown to
be a bit slower. So let's say, it takes about the same time.

However, I also tested code with an early return (not shown below),
and this was shown to be faster than both code using break and
code using next+generator by a factor of about 1.6, even though
the code with return has the "function call overhead"!

But please be aware that such results depend on the Python
implementation and version being used for the benchmark
and also on the details of how exactly the benchmark is written.

import random
import string
import timeit

print( 'The following loop may need a few seconds or minutes, '
       'so please bear with me.' )

time_using_break = 0
time_using_next = 0

for repetition in range( 100 ):
    for i in range( 100 ): # Yes, this nesting is redundant!

        list_ = \
        [ ''.join \
          ( random.choices \
            ( string.ascii_lowercase, k=random.randint( 1, 30 )))
          for i in range( random.randint( 0, 50 ))]

        start_time = timeit.default_timer()
        for word in list_:
            if word[ 0 ]== 'e':
                word_using_break = word
                break
        else:
            word_using_break = ''
        time_using_break += timeit.default_timer() - start_time

        start_time = timeit.default_timer()
        word_using_next = \
        next( ( word for word in list_ if word[ 0 ]== 'e' ), '' )
        time_using_next += timeit.default_timer() - start_time

        if word_using_next != word_using_break:
            raise Exception( 'word_using_next != word_using_break' )

print( f'{time_using_break = }' )
print( f'{time_using_next = }' )
print( f'{time_using_next / time_using_break = }' )
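
For comparison, a sketch of the same measurement using `timeit.repeat` on a fixed set of lists, so that both approaches see identical data and the order in which they run matters less. The helper names (`using_break`, `using_next`, `run_all`) and the sizes are assumptions for illustration. Note that wrapping the `break` loop in a function adds call overhead that the inline version above does not have, so the numbers are not directly comparable with it.

```python
import random
import string
import timeit

random.seed(0)  # fixed seed: the same data on every run

# Assumed test data: 200 random word lists shaped like those in the
# benchmark above (every word has at least one letter).
lists = [
    [''.join(random.choices(string.ascii_lowercase, k=random.randint(1, 30)))
     for _ in range(random.randint(0, 50))]
    for _ in range(200)
]

def using_break(list_):
    # Procedural search: for/else supplies the default when nothing matches.
    for word in list_:
        if word[0] == 'e':
            result = word
            break
    else:
        result = ''
    return result

def using_next(list_):
    # next() on a generator expression, with '' as the default.
    return next((word for word in list_ if word[0] == 'e'), '')

def run_all(func):
    for list_ in lists:
        func(list_)

# Both approaches must agree before the timings mean anything.
assert all(using_break(l) == using_next(l) for l in lists)

# timeit.repeat returns one total per repetition; the minimum is the
# measurement least disturbed by other activity on the machine.
best_break = min(timeit.repeat(lambda: run_all(using_break), number=20, repeat=5))
best_next = min(timeit.repeat(lambda: run_all(using_next), number=20, repeat=5))

print(f'{best_break = }')
print(f'{best_next = }')
print(f'{best_next / best_break = }')
```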

From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.python on Fri Apr 5 18:32:22 2024

From Newsgroup: comp.lang.python

ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:

However, I also tested code with an early return (not shown below),
and this was shown to be faster than both code using break and
code using next+generator by a factor of about 1.6, even though
the code with return has the "function call overhead"!

See "return" benchmarked against "break" below!

import random
import string
import timeit

print( 'The following loop may need a few seconds or minutes, '
       'so please bear with me.' )

def get_word_using_return( list_ ):
    for word in list_:
        if word[ 0 ]== 'e':
            return word
    return ''

time_using_break = 0
time_using_return = 0

for repetition in range( 100 ):
    for i in range( 100 ): # Yes, this nesting is redundant!

        list_ = \
        [ ''.join \
          ( random.choices \
            ( string.ascii_lowercase, k=random.randint( 1, 30 )))
          for i in range( random.randint( 0, 50 ))]

        start_time = timeit.default_timer()
        for word in list_:
            if word[ 0 ]== 'e':
                word_using_break = word
                break
        else:
            word_using_break = ''
        time_using_break += timeit.default_timer() - start_time

        start_time = timeit.default_timer()
        word_using_return = get_word_using_return( list_ )
        time_using_return += timeit.default_timer() - start_time

        if word_using_return != word_using_break:
            raise Exception( 'word_using_return != word_using_break' )

print( f'{time_using_break = }' )
print( f'{time_using_return = }' )
print( f'{time_using_return / time_using_break = }' )

From Mark Bourne@nntp.mbourne@spamgourmet.com to comp.lang.python on Fri Apr 5 20:42:15 2024

From Newsgroup: comp.lang.python

avi.e.gross@gmail.com wrote:

That is an excellent point, Mark. Some of the proposed variants to the requested problem, including mine, do indeed find all instances only to return the first. This can use additional time and space but when done, some of the overhead is also gone. What I mean is that a generator you create and invoke once, generally sits around indefinitely in your session unless it leaves your current range or something. It does only a part of the work and must remain suspended and ready to be called again to do more.

It goes out of scope at the end of the function. Unless you return it
or store a reference to it elsewhere, it will then be deleted.

Or in this case, since the `first_word_beginning_with_e` function
doesn't even have a local reference to the generator (it is just created
and immediately passed as an argument to `next`), it goes out of scope
once the `next` function returns.

If you create a generator inside a function and the function returns, presumably it can be garbage-collected.

Exactly. It probably doesn't even need to wait for garbage collection -
once the reference count is zero, it can be destroyed.
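
A small sketch of that behaviour on CPython, where destruction happens as soon as the last reference goes away (other implementations may delay it until a garbage-collection pass). The `events` list and the `try`/`finally` are assumptions added only to make the moment of finalization visible:

```python
events = []

def words():
    try:
        for word in ['delta', 'epsilon', 'zeta', 'eta', 'theta']:
            yield word
    finally:
        # Runs when the generator is exhausted, closed, or finalized.
        events.append('generator finalized')

def first_word_beginning_with_e():
    gen = words()  # function-local reference to the generator
    return next((word for word in gen if word[0] == 'e'), None)

result = first_word_beginning_with_e()
events.append('function returned')

# On CPython the generator is finalized while the function is returning,
# so its entry is recorded before 'function returned'.
print(result, events)
```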

But if it is in the main body, I have to wonder what happens.

If you mean in the top-level module scope outside of any
function/method, then it would remain in memory until the process exits.

There seem to be several related scenarios to consider.

- You may want to find, in our example, a first instance. Right afterwards, you want the generator to disassemble anything in use.
- You may want the generator to stick around and later be able to return the next instance. The generator can only really go away when another call has been made after the last available instance and it cannot look for more beyond some end.
- Finally, you can call a generator with the goal of getting all instances such as by asking it to populate a list. In such a case, you may not necessarily want or need to use a generator expression and can use something straightforward and possibly cheaper.

Yes, so you create and assign it at an appropriate scope. In the
example here, it's just passed to `next` and then destroyed. Passing a generator to the `list` constructor (or the `tuple` constructor in my
"FWIW") would behave similarly - you'd get the final list/tuple back,
but the generator would be destroyed once that call is done. If you
assigned it to a function-local variable, it would exist until the end
of that function.
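
The three scenarios could look roughly like this, as a sketch using the word list from the original post (the variable names are only illustrative):

```python
words = ['delta', 'epsilon', 'zeta', 'eta', 'theta']

# 1. Only the first match is wanted: the generator is a temporary
#    argument to next() and nothing else keeps a reference to it.
first = next((w for w in words if w[0] == 'e'), None)

# 2. Later matches may be wanted: keep a reference and resume on demand.
matches = (w for w in words if w[0] == 'e')
first_again = next(matches, None)
second = next(matches, None)
third = next(matches, None)  # exhausted, so the default is returned

# 3. All matches are wanted: a list comprehension is more direct than
#    a generator expression here.
all_matches = [w for w in words if w[0] == 'e']

print(first, first_again, second, third, all_matches)
```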

What confuses the issue, for me, is that you can make fairly complex calculations in python using various forms of generators that implement a sort of just-in-time approach as generators call other generators which call yet others and so on.

Yes, you can. It can be quite useful when used appropriately.

Imagine having folders full of files that each contain a data structure such as a dictionary or set and writing functionality that searches for the first match for a key in any of the dictionaries (or sets or whatever) along the way? Now imagine that dictionary items can be a key value pair that can include the value being a deeper dictionary, perhaps down multiple levels.

You could get one generator that generates folder names or opens them and another that generates file names and reads in the data structure such as a dictionary and yet another that searches each dictionary and also any internally embedded dictionaries by calling another instance of the same generator as much as needed.

You probably could do that. Personally, I probably wouldn't use
generators for that, or at least not custom ones - if you're talking
about iterating over directories and files on disk, I'd probably just
use `os.walk` (which probably is a generator) and iterate over that,
opening each file and doing whatever you want with the contents.
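
A rough sketch of that approach, under the assumption that each file holds a JSON object; the function names, the `.json` suffix and the `_MISSING` sentinel are hypothetical. `os.walk` does the directory traversal, and a small recursive generator handles nested dictionaries:

```python
import json
import os
import tempfile

_MISSING = object()  # sentinel, so that a stored None still counts as found

def iter_values(data, key):
    """Yield every value stored under `key`, descending into nested dicts."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            yield from iter_values(v, key)

def first_value_in_tree(root, key, default=None):
    """Return the first value for `key` in any *.json file under `root`."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for filename in sorted(filenames):
            if not filename.endswith('.json'):
                continue
            with open(os.path.join(dirpath, filename), encoding='utf-8') as f:
                data = json.load(f)
            found = next(iter_values(data, key), _MISSING)
            if found is not _MISSING:
                return found  # stop at the first hit; later files are not read
    return default

# Usage with a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'a'))
    with open(os.path.join(root, 'a', 'one.json'), 'w', encoding='utf-8') as f:
        json.dump({'x': 1}, f)
    with open(os.path.join(root, 'a', 'two.json'), 'w', encoding='utf-8') as f:
        json.dump({'outer': {'inner': {'target': 42}}}, f)
    result = first_value_in_tree(root, 'target')
    missing = first_value_in_tree(root, 'absent', default='not found')

print(result, missing)
```

Each file is closed as soon as it has been parsed, so nothing is left open when the search stops early.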

You can see how this creates and often consumes generators along the way as needed and in a sense does the minimum amount of work needed to find a first instance. But what might it leave open and taking up resources if not finished in a way that dismantles it?

You'd need to make sure any files are closed (`with open(...)` helps
with that). If you're opening files within a generator, I'm pretty sure
you can do something like:
```
def iter_files(directory):
    for filename in directory:
        with open(filename) as f:
            yield f
```

Then the file will be closed when the generator leaves the `with` block
and moves on to the next item (presumably there's some mechanism for the context manager's `__exit__` to be called if the generator is destroyed without having iterated over all the items - the whole point of using `with`
is that `__exit__` is guaranteed to be called whatever happens).
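
That mechanism is the generator's `close()` method, which raises `GeneratorExit` at the suspended `yield` so that enclosing `with` and `finally` blocks get to run. A small sketch that checks this with temporary files; the file names are made up, and `iter_files` here takes a list of paths rather than a directory:

```python
import os
import tempfile

def iter_files(filenames):
    for filename in filenames:
        with open(filename) as f:
            yield f

with tempfile.TemporaryDirectory() as tmp:
    # Create two small files to iterate over.
    paths = []
    for name in ('a.txt', 'b.txt'):
        path = os.path.join(tmp, name)
        with open(path, 'w') as out:
            out.write(name)
        paths.append(path)

    files = iter_files(paths)
    first = next(files)
    # The generator is paused inside the `with`, so the file is still open.
    open_while_suspended = not first.closed
    # close() raises GeneratorExit at the `yield`; the `with` block exits.
    files.close()
    closed_after_close = first.closed

print(open_while_suspended, closed_after_close)
```

Calling `close()` explicitly (or using `contextlib.closing`) avoids relying on when the interpreter happens to destroy the generator.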

Other than that, the generators themselves would be destroyed once they
go out of scope. If there are no references to a generator left,
nothing is going to be able to call `next` (nor anything else) on it, so
no need for it to be kept hanging around in memory.

Perhaps worse, imagine doing the search in parallel and as soon as it is found anywhere, ...

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com@python.org> On Behalf Of Mark Bourne via Python-list
Sent: Thursday, April 4, 2024 3:04 PM
To: python-list@python.org
Subject: Re: A technique from a chatbot

Thomas Passin wrote:

On 4/2/2024 1:47 PM, Piergiorgio Sartor via Python-list wrote:

On 02/04/2024 19.18, Stefan Ram wrote:

Some people can't believe it when I say that chatbots improve
my programming productivity. So, here's a technique I learned
from a chatbot!
It is a structured "break". "Break" still is a kind of jump,
you know?
So, what's a function to return the first word beginning with
an "e" in a given list, like for example
[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]

? Well it's
def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word

. "return" still can be considered a kind of "goto" statement.
It can lead to errors:

def first_word_beginning_with_e( list_ ):
    for word in list_:
        if word[ 0 ]== 'e': return word
    something_to_be_done_at_the_end_of_this_function()
The call sometimes will not be executed here!
So, "return" is similar to "break" in that regard.
But in Python we can write:
def first_word_beginning_with_e( list_ ):
    return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

Doesn't look a smart advice.

. No jumps anymore, yet the loop is aborted on the first hit

It's worse than "not a smart advice". This code constructs an
unnecessary tuple, then picks out its first element and returns that.

I don't think there's a tuple being created. If you mean:
( word for word in list_ if word[ 0 ]== 'e' )

...that's not creating a tuple. It's a generator expression, which
generates the next value each time it's called for. If you only ever
ask for the first item, it only generates that one.

When I first came across them, I did find it a bit odd that generator expressions look like the tuple equivalent of list/dictionary
comprehensions.

FWIW, if you actually wanted a tuple from that expression, you'd need to
pass the generator to tuple's constructor:
tuple(word for word in list_ if word[0] == 'e')
(You don't need to include an extra set of brackets when passing a
generator expression as the only argument to a function).


From Mark Bourne@nntp.mbourne@spamgourmet.com to comp.lang.python on Fri Apr 5 20:59:54 2024

From Newsgroup: comp.lang.python

Stefan Ram wrote:

Mark Bourne <nntp.mbourne@spamgourmet.com> wrote or quoted:

I don't think there's a tuple being created. If you mean:
( word for word in list_ if word[ 0 ]== 'e' )
...that's not creating a tuple. It's a generator expression, which
generates the next value each time it's called for. If you only ever
ask for the first item, it only generates that one.

Yes, that's also how I understand it!

In the meantime, I wrote code for a microbenchmark, shown below.

This code, when executed on my computer, shows that the
next+generator approach is a bit faster when compared with
the procedural break approach. But when the order of the two
approaches is being swapped in the loop, then it is shown to
be a bit slower. So let's say, it takes about the same time.

There could be some caching going on, meaning whichever is done second
comes out a bit faster.

However, I also tested code with an early return (not shown below),
and this was shown to be faster than both code using break and
code using next+generator by a factor of about 1.6, even though
the code with return has the "function call overhead"!

To be honest, that's how I'd probably write it - not because of any
thought that it might be faster, but just because it's clearer. And if
there's a `do_something_else()` that needs to be called regardless of
whether a word was found, split it into two functions:
```
def first_word_beginning_with_e(target, wordlist):
    for w in wordlist:
        if w.startswith(target):
            return w
    return ''

def find_word_and_do_something_else(target, wordlist):
    result = first_word_beginning_with_e(target, wordlist)
    do_something_else()
    return result
```
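
If splitting into two functions is not convenient, `try`/`finally` is another way to keep the early `return` while still running the end-of-function call. A minimal sketch; `do_something_else` here is a hypothetical stand-in that just records that it ran:

```python
calls = []

def do_something_else():
    # Hypothetical stand-in for the work needed at the end of the function.
    calls.append('done')

def first_word_beginning_with(target, wordlist):
    try:
        for w in wordlist:
            if w.startswith(target):
                return w  # early return: the finally block still runs
        return ''
    finally:
        do_something_else()

found = first_word_beginning_with('e', ['delta', 'epsilon', 'zeta', 'eta', 'theta'])
not_found = first_word_beginning_with('x', ['delta', 'epsilon'])

print(found, repr(not_found), calls)
```

Note that the `finally` block also runs when an exception propagates out of the loop, which may or may not be what is wanted.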

But please be aware that such results depend on the Python
implementation and version being used for the benchmark
and also on the details of how exactly the benchmark is written.

import random
import string
import timeit

print( 'The following loop may need a few seconds or minutes, '
       'so please bear with me.' )

time_using_break = 0
time_using_next = 0

for repetition in range( 100 ):
    for i in range( 100 ): # Yes, this nesting is redundant!

        list_ = \
        [ ''.join \
          ( random.choices \
            ( string.ascii_lowercase, k=random.randint( 1, 30 )))
          for i in range( random.randint( 0, 50 ))]

        start_time = timeit.default_timer()
        for word in list_:
            if word[ 0 ]== 'e':
                word_using_break = word
                break
        else:
            word_using_break = ''
        time_using_break += timeit.default_timer() - start_time

        start_time = timeit.default_timer()
        word_using_next = \
        next( ( word for word in list_ if word[ 0 ]== 'e' ), '' )
        time_using_next += timeit.default_timer() - start_time

        if word_using_next != word_using_break:
            raise Exception( 'word_using_next != word_using_break' )

print( f'{time_using_break = }' )
print( f'{time_using_next = }' )
print( f'{time_using_next / time_using_break = }' )

