• A technique from a chatbot

    From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.python on Tue Apr 2 17:18:16 2024
    From Newsgroup: comp.lang.python

    Some people can't believe it when I say that chatbots improve
    my programming productivity. So, here's a technique I learned
    from a chatbot!

    It is a structured "break". "Break" still is a kind of jump,
    you know?

    So, what's a function to return the first word beginning with
    an "e" in a given list, like for example

    [ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]

    ? Well it's

    def first_word_beginning_with_e( list_ ):
        for word in list_:
            if word[ 0 ]== 'e': return word

    . "return" still can be considered a kind of "goto" statement.
    It can lead to errors:

    def first_word_beginning_with_e( list_ ):
        for word in list_:
            if word[ 0 ]== 'e': return word
        something_to_be_done_at_the_end_of_this_function()

    The call sometimes will not be executed here!
    So, "return" is similar to "break" in that regard.
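
    To make the pitfall concrete, here is a minimal sketch. The names
    log and first_e_with_cleanup are made up for illustration; appending
    to log stands in for something_to_be_done_at_the_end_of_this_function().

```python
log = []  # hypothetical stand-in for the side effect of the end-of-function call

def first_e_with_cleanup(words):
    for word in words:
        if word[0] == 'e':
            return word          # leaves the function immediately
    log.append('cleanup ran')    # reached only when no word matched

print(first_e_with_cleanup(['delta', 'epsilon']))  # a match: the cleanup line is skipped
print(log)                                         # []
print(first_e_with_cleanup(['delta', 'zeta']))     # no match: the cleanup line runs
print(log)                                         # ['cleanup ran']
```

    The cleanup line is reached only on the path where nothing matched,
    which is the inconsistency described above.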

    But in Python we can write:

    def first_word_beginning_with_e( list_ ):
        return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

    . No jumps anymore, yet the loop is aborted on the first hit
    (if I guess correctly how it's working).

    And it is this combination of "next", a generator, and "None" that
    the chatbot showed me when I asked him how to get the first component
    of a list that matches a condition!
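
    The same combination can be wrapped in a small general-purpose
    helper. This is only a sketch: the name first and its parameters are
    assumptions made here, not something from the chatbot's answer.

```python
def first(iterable, predicate, default=None):
    """Return the first item for which predicate(item) is true, else default."""
    # next() pulls items from the generator one at a time and stops at the first hit;
    # the second argument is returned when the generator is exhausted without a hit.
    return next((item for item in iterable if predicate(item)), default)

words = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
print(first(words, lambda w: w.startswith('e')))       # epsilon
print(first(words, lambda w: w.startswith('x')))       # None
print(first(words, lambda w: w.startswith('x'), ''))   # prints an empty line
```

    Using startswith() instead of word[ 0 ] also avoids an IndexError
    if the list happens to contain an empty string.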

    PS: Let's verify the earliness of the exit out of the loop:

    Main.py

    def list_():
        list__ =[ 'delta', 'epsilon', 'zeta', 'eta', 'theta' ]
        for entry in list__:
            print( f'Now yielding {entry}.' )
            yield entry

    def first_word_beginning_with_e( list_ ):
        return next( ( word for word in list_() if word[ 0 ]== 'e' ), None )

    print( first_word_beginning_with_e( list_ ))

    sys.stdout

    Now yielding delta.
    Now yielding epsilon.
    epsilon
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From piergiorgio.sartor.this.should.not.be.used@piergiorgio.sartor.this.should.not.be.used@nexgo.REMOVETHIS.de to comp.lang.python on Tue Apr 2 19:47:59 2024
    From Newsgroup: comp.lang.python

    On 02/04/2024 19.18, Stefan Ram wrote:
    But in Python we can write:

    def first_word_beginning_with_e( list_ ):
        return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

    Doesn't look a smart advice.

    . No jumps anymore, yet the loop is aborted on the first hit

    First of all, I fail to understand why there
    should be no jumps any more.
    It depends on how "return" and "if" are handled,
    I guess, in different context.
    Maybe they're just "masked".
    In any case, the "compiler" should have just
    done the same.
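
    Whether jumps remain can be checked at the bytecode level. A minimal
    sketch, assuming CPython and the standard dis module; opnames is a
    helper made up here, and the exact opcode names differ between
    Python versions, so no particular output is claimed.

```python
import dis

def first_word_beginning_with_e(list_):
    return next((word for word in list_ if word[0] == 'e'), None)

def opnames(code):
    """Collect opcode names from a code object and any code objects nested in it."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if hasattr(const, 'co_code'):   # nested code object, e.g. the generator expression
            names |= opnames(const)
    return names

names = opnames(first_word_beginning_with_e.__code__)
# Show only the loop and jump related opcodes that the compiler emitted.
print(sorted(n for n in names if 'JUMP' in n or n == 'FOR_ITER'))
```

    On the interpreters I would expect this to run on, the list is not
    empty: the jumps are still there, just no longer written out in the
    source.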

    (if I guess correctly how its working).

    Second, it is difficult to read, which is bad.
    The "guess" above is just evidence of that.

    My personal opinion about these "chatbots", is
    that, while they might deliver clever solutions,
    they are not explaining *why* these solutions
    should be considered "clever".
    Which is the most important thing (the solution
    itself is _not_).

    bye,
    --

    piergiorgio

  • From Thomas Passin@list1@tompassin.net to comp.lang.python on Tue Apr 2 15:31:26 2024
    From Newsgroup: comp.lang.python

    On 4/2/2024 1:47 PM, Piergiorgio Sartor via Python-list wrote:
    On 02/04/2024 19.18, Stefan Ram wrote:
    But in Python we can write:

    def first_word_beginning_with_e( list_ ):
        return next( ( word for word in list_ if word[ 0 ]== 'e' ), None )

    Doesn't look a smart advice.

       . No jumps anymore, yet the loop is aborted on the first hit

    It's worse than "not a smart advice". This code constructs an
    unnecessary tuple, then picks out its first element and returns that.
    The something_to_be_done() function may or may not be called. And it's
    harder to read and understand than necessary. Compare, for example,
    with this version:

    def first_word_beginning_with_e(target, wordlist):
        result = ''
        for w in wordlist:
            if w.startswith(target):
                result = w
                break
        do_something_else()
        return result

    If do_something_else() is supposed to fire only if the target is not
    found, then this slight modification will do:

    def first_word_beginning_with_e(target, wordlist):
        result = ''
        for w in wordlist:
            if w.startswith(target):
                result = w
                break
        else:
            do_something_else()
        return result

    [Using the "target" argument instead of "target[0]" will let you match
    an initial string instead of just the first character].
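
    The difference between the two placements of do_something_else()
    comes down to how for/else works: the else clause runs only when the
    loop ends without hitting break. A small sketch of that behaviour;
    the on_miss callback and the misses list are hypothetical stand-ins
    for do_something_else().

```python
def first_word_starting_with(target, wordlist, on_miss):
    """Return the first word starting with target; call on_miss() only if none does."""
    result = ''
    for w in wordlist:
        if w.startswith(target):
            result = w
            break                # skips the else clause below
    else:
        on_miss()                # runs only when the loop finished without break
    return result

misses = []
words = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
print(first_word_starting_with('e', words, lambda: misses.append('e')))   # epsilon
print(first_word_starting_with('x', words, lambda: misses.append('x')))   # prints an empty line
print(misses)                                                             # ['x']
```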

  • From avi.e.gross@avi.e.gross@gmail.com to comp.lang.python on Wed Apr 3 01:27:00 2024
    From Newsgroup: comp.lang.python

    I am a tad confused by a suggestion that any kind of GOTO variant is bad. The suggestion runs counter to the reality that underneath it all, compiled programs are chock full of GOTO variants even for simple things like IF-ELSE.

    Consider the code here:

    def first_word_beginning_with_e( list_ ):
        for word in list_:
            if word[ 0 ]== 'e': return word
        something_to_be_done_at_the_end_of_this_function()

    If instead the function initialized a variable to nothing useful, and in the loop, on finding a word beginning with e while the variable still contained nothing useful, copied the word into the variable, then allowed the loop to run to completion and finally returned the variable, that would simply be a much less efficient solution to the problem and gain NOTHING. There are many variants you can come up with, and when the conditions are complex and there are many points of immediate return, fine, then it may be dangerous. But a single return is fine.

    The function does have a flaw as it is not clear what it should do if nothing is found. Calling a silly long name does not necessarily return anything.

    Others, like Thomas, have shown other variants including some longer and more complex ways.

    A fairly simple one-liner version, not necessarily efficient, would be to just use a list comprehension that makes a new list of just the ones matching the pattern of starting with an 'e' and then returns the first entry or None. This shows the code and tests it:

    text = ["eastern", "Western", "easter"]
    NorEaster = ["North", "West", "orient"]

    def first_word_beginning_with_e( list_ ):
        return(result[0] if (result := [word for word in list_ if word[0].lower() == 'e']) else None)

    print(first_word_beginning_with_e( text ))
    print(first_word_beginning_with_e( NorEaster ))

    Result of running it on a version of Python at least 3.8, so that it supports the walrus operator:

    eastern
    None
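
    How much extra work the list comprehension does compared with a
    lazy generator can be made visible by counting how many words get
    examined. A rough sketch; starts_with_e and the one-element counter
    lists are helpers invented for this illustration.

```python
def starts_with_e(word, counter):
    counter[0] += 1                      # count how many words were examined
    return word[:1].lower() == 'e'

words = ["Western", "eastern", "easter", "North", "West", "orient"]

eager = [0]
matches = [w for w in words if starts_with_e(w, eager)]   # builds the whole list first
first_eager = matches[0] if matches else None

lazy = [0]
first_lazy = next((w for w in words if starts_with_e(w, lazy)), None)   # stops at first hit

print(first_eager, eager[0])   # eastern 6
print(first_lazy, lazy[0])     # eastern 2
```

    Both give the same answer; the list comprehension looks at every
    word, while the generator stops after the second one.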
  • From Thomas Passin@list1@tompassin.net to comp.lang.python on Wed Apr 3 07:50:55 2024
    From Newsgroup: comp.lang.python

    On 4/3/2024 1:27 AM, AVI GROSS via Python-list wrote:
    The function does have a flaw as it is not clear what it should do if nothing is found. Calling a silly long name does not necessarily return anything.

    The OP seems to want to return None if a match is not found. If a
    Python function ends without a return statement, it automatically
    returns None. So nothing special needs to be done. True, that is
    probably a special case, but it suggests that the problem posed to the
    chatbot was not posed well. A truly useful chatbot could have discussed
    many of the points we've been discussing. That would have made for a
    good learning experience. Instead the chatbot produced poorly
    constructed code that caused a bad learning experience.
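
    A short illustration of the implicit None, using the plain loop
    version from the start of the thread (with word[:1] substituted for
    word[ 0 ] so that empty strings do not raise):

```python
def first_word_beginning_with_e(list_):
    for word in list_:
        if word[:1] == 'e':
            return word
    # no explicit return here: falling off the end of the function returns None

print(first_word_beginning_with_e(['delta', 'epsilon']))   # epsilon
print(first_word_beginning_with_e(['delta', 'zeta']))      # None
```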


    [snip...]

  • From avi.e.gross@avi.e.gross@gmail.com to comp.lang.python on Wed Apr 3 11:32:44 2024
    From Newsgroup: comp.lang.python

    Sadly, Thomas, this is not even all that new.

    I have seen people do searches on the internet for how to do one thing at a time and then cobble together some code that does something but perhaps not quite what they intended. Some things are just inefficient, such as reading data from a file, doing some calculations, writing the results to another file, reading them back in and doing more calculations and writing them out again and so on. Yes, there can be value in storing intermediate results, but why read it in again when it is already in memory? And, in some cases, why not do multiple steps instead of one at a time and so on.

    How many people ask how to TEST the code they get, especially from an
    AI-like ...?



  • From Gilmeh Serda@gilmeh.serda@nothing.here.invalid to comp.lang.python on Wed Apr 3 18:45:20 2024
    From Newsgroup: comp.lang.python

    On 2 Apr 2024 17:18:16 GMT, Stefan Ram wrote:

    first_word_beginning_with_e

    Here's another one:

    >>> def ret_first_eword():
    ...     return [w for w in ['delta', 'epsilon', 'zeta', 'eta', 'theta'] if w.startswith('e')][0]
    ...
    >>> ret_first_eword()
    'epsilon'
    --
    Gilmeh

    Linux is addictive, I'm hooked! -- MaDsen Wikholm's .sig
  • From Pieter van Oostrum@pieter-l@vanoostrum.org to comp.lang.python on Wed Apr 3 23:15:18 2024
    From Newsgroup: comp.lang.python

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    It can lead to errors:

    def first_word_beginning_with_e( list_ ):
        for word in list_:
            if word[ 0 ]== 'e': return word
        something_to_be_done_at_the_end_of_this_function()

    The call sometimes will not be executed here!
    So, "return" is similar to "break" in that regard.

    That can be solved with finally:

    def first_word_beginning_with_e( list_ ):
        try:
            for word in list_:
                if word[ 0 ]== 'e': return word
        finally:
            print("something_to_be_done_at_the_end_of_this_function()")
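
    A quick check that the finally clause runs on both paths. This is a
    sketch only: the calls list is a made-up log standing in for the
    end-of-function work, and word[:1] is used so empty strings do not
    raise.

```python
calls = []   # hypothetical log standing in for the end-of-function work

def first_word_beginning_with_e(list_):
    try:
        for word in list_:
            if word[:1] == 'e':
                return word
    finally:
        calls.append('done')     # runs on early return and on normal exit alike

print(first_word_beginning_with_e(['delta', 'epsilon']), calls)   # epsilon ['done']
print(first_word_beginning_with_e(['delta', 'zeta']), calls)      # None ['done', 'done']
```

    The finally clause would also run if the loop body raised an
    exception, which may or may not be what is wanted.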
    --
    Pieter van Oostrum <pieter@vanoostrum.org>
    www: http://pieter.vanoostrum.org/
    PGP key: [8DAE142BE17999C4]
  • From Michael F. Stemper@michael.stemper@gmail.com to comp.lang.python on Wed Apr 3 16:36:30 2024
    From Newsgroup: comp.lang.python

    On 03/04/2024 13.45, Gilmeh Serda wrote:
    On 2 Apr 2024 17:18:16 GMT, Stefan Ram wrote:

    first_word_beginning_with_e

    Here's another one:

    >>> def ret_first_eword():
    ...     return [w for w in ['delta', 'epsilon', 'zeta', 'eta', 'theta'] if w.startswith('e')][0]
    ...
    >>> ret_first_eword()
    'epsilon'

    Doesn't work in the case where there isn't a word starting with 'e':

    >>> def find_e( l ):
    ...     return [w for w in l if w.startswith('e')][0]
    ...
    >>> l = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
    >>> find_e(l)
    'epsilon'
    >>> l = ['The','fan-jet','airline']
    >>> find_e(l)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in find_e
    IndexError: list index out of range
    >>>
    --
    Michael F. Stemper
    If it isn't running programs and it isn't fusing atoms, it's just bending space.

  • From Gilmeh Serda@gilmeh.serda@nothing.here.invalid to comp.lang.python on Thu Apr 4 17:35:33 2024
    From Newsgroup: comp.lang.python

    On Wed, 3 Apr 2024 16:36:30 -0500, Michael F. Stemper wrote:

    Doesn't work in the case where there isn't a word starting with 'e':

    >>> def find_e( l ):
    ...     return [w for w in l if w.startswith('e')][0]
    ...
    >>> l = ['delta', 'epsilon', 'zeta', 'eta', 'theta']
    >>> find_e(l)
    'epsilon'
    >>> l = ['The','fan-jet','airline']
    >>> find_e(l)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in find_e
    IndexError: list index out of range
    >>>

    Wow! It wasn't production code. And still isn't. (o.Ô)

    >>> def find_e(l):
    ...     try:
    ...         return [w for w in l if w.startswith('e')][0]
    ...     except IndexError:
    ...         return None # or 0 or '' or whatever you want
    ...
    >>> find_e(l)
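
    For comparison, a sketch that puts the try/except version next to
    the next()-with-default version from the start of the thread. The
    function names find_e_indexing and find_e_next are made up here.

```python
def find_e_indexing(words):
    try:
        return [w for w in words if w.startswith('e')][0]
    except IndexError:            # empty result list: nothing matched
        return None

def find_e_next(words):
    return next((w for w in words if w.startswith('e')), None)

for words in (['delta', 'epsilon', 'zeta'], ['The', 'fan-jet', 'airline']):
    print(find_e_indexing(words), find_e_next(words))

# Without a default, next() raises StopIteration rather than IndexError.
try:
    next(w for w in ['The', 'fan-jet', 'airline'] if w.startswith('e'))
except StopIteration:
    print('no match')
```

    Both functions return None for the second list; the default
    argument to next() plays the role of the except clause.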
    --
    Gilmeh

    Drop in any mailbox.
  • From Mark Bourne@nntp.mbourne@spamgourmet.com to comp.lang.python on Thu Apr 4 20:03:45 2024
    From Newsgroup: comp.lang.python

    Thomas Passin wrote:
    It's worse than "not a smart advice". This code constructs an
    unnecessary tuple, then picks out its first element and returns that.

    I don't think there's a tuple being created. If you mean:

        ( word for word in list_ if word[ 0 ]== 'e' )

    ...that's not creating a tuple. It's a generator expression, which
    generates the next value each time it's called for. If you only ever
    ask for the first item, it only generates that one.

    When I first came across them, I did find it a bit odd that generator
    expressions look like the tuple equivalent of list/dictionary
    comprehensions.

    FWIW, if you actually wanted a tuple from that expression, you'd need to
    pass the generator to tuple's constructor:

        tuple(word for word in list_ if word[0] == 'e')

    (You don't need to include an extra set of brackets when passing a
    generator as the only argument to a function).
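
    This is easy to confirm interactively. A minimal sketch using the
    word list from the original post and the standard types module:

```python
import types

list_ = ['delta', 'epsilon', 'zeta', 'eta', 'theta']

gen = (word for word in list_ if word[0] == 'e')
print(isinstance(gen, types.GeneratorType))   # True
print(isinstance(gen, tuple))                 # False

print(next(gen))     # epsilon  (only as much work as needed for one item)
print(tuple(gen))    # ('eta',)  the remaining matches; the generator is now exhausted
print(tuple(word for word in list_ if word[0] == 'e'))   # ('epsilon', 'eta')
```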
    --
    Mark.
  • From avi.e.gross@avi.e.gross@gmail.com to comp.lang.python on Thu Apr 4 16:33:57 2024
    From Newsgroup: comp.lang.python

    That is an excellent point, Mark. Some of the proposed variants to the requested problem, including mine, do indeed find all instances only to return the first. This can use additional time and space but when done, some of the overhead is also gone. What I mean is that a generator you create and invoke once generally sits around indefinitely in your session unless it goes out of scope or something. It does only a part of the work and must remain suspended and ready to be called again to do more.

    If you create a generator inside a function and the function returns, presumably it can be garbage-collected.

    But if it is in the main body, I have to wonder what happens.

    There seem to be several related scenarios to consider.

    - You may want to find, in our example, a first instance. Right afterwards, you want the generator to disassemble anything in use.
    - You may want the generator to stick around and later be able to return the next instance. The generator can only really go away when another call has been made after the last available instance and it cannot look for more beyond some end.
    - Finally, you can call a generator with the goal of getting all instances, such as by asking it to populate a list. In such a case, you may not necessarily want or need to use a generator expression and can use something straightforward and possibly cheaper.

    What confuses the issue, for me, is that you can make fairly complex calculations in Python using various forms of generators that implement a sort of just-in-time approach as generators call other generators which call yet others and so on. Imagine having folders full of files that each contain a data structure such as a dictionary or set and writing functionality that searches for the first match for a key in any of the dictionaries (or sets or whatever) along the way? Now imagine that dictionary items can be a key-value pair where the value is a deeper dictionary, perhaps down multiple levels.

    You could get one generator that generates folder names or opens them and another that generates file names and reads in the data structure such as a dictionary and yet another that searches each dictionary and also any internally embedded dictionaries by calling another instance of the same generator as much as needed.

    You can see how this creates and often consumes generators along the way as needed and in a sense does the minimum amount of work needed to find a first instance. But what might it leave open and taking up resources if not finished in a way that dismantles it?

    Perhaps worse, imagine doing the search in parallel and as soon as it is found anywhere, ...
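
    On the question of what a half-consumed generator leaves behind, a
    small sketch may help. The names words_from and events are invented
    for this illustration. As far as I know, CPython normally reclaims a
    generator as soon as nothing refers to it any more, and closing a
    generator runs any pending finally blocks; calling close() yourself
    is the deterministic way to get that cleanup.

```python
events = []   # hypothetical log, only for illustration

def words_from(source):
    try:
        for word in source:
            yield word
    finally:
        events.append('generator finalized')   # runs when the generator is closed

gen = words_from(['delta', 'epsilon', 'zeta'])
hit = next((w for w in gen if w.startswith('e')), None)
print(hit, events)      # epsilon []  (gen is suspended, not finished)

gen.close()             # explicit, deterministic cleanup
print(events)           # ['generator finalized']
```

    So the first scenario in the list above can be handled by closing
    the generator once the first instance has been found.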
  • From Thomas Passin@list1@tompassin.net to comp.lang.python on Thu Apr 4 17:10:34 2024
    From Newsgroup: comp.lang.python

    On 4/4/2024 3:03 PM, Mark Bourne via Python-list wrote:
    I don't think there's a tuple being created.  If you mean:
        ( word for word in list_ if word[ 0 ]== 'e' )

    ...that's not creating a tuple.  It's a generator expression, which generates the next value each time it's called for.  If you only ever
    ask for the first item, it only generates that one.

    Yes, I was careless when I wrote that. Still, the tuple machinery has to
    be created and that's not necessary here. My point was that you are
    asking the Python machinery to do extra work for no benefit in
    performance or readability.

    When I first came across them, I did find it a bit odd that generator expressions look like the tuple equivalent of list/dictionary comprehensions.

    FWIW, if you actually wanted a tuple from that expression, you'd need to pass the generator to tuple's constructor:
        tuple(word for word in list_ if word[0] == 'e')
    (You don't need to include an extra set of brackets when passing a
    generator as the only argument to a function).


  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.python on Fri Apr 5 18:29:22 2024
    From Newsgroup: comp.lang.python

    Mark Bourne <nntp.mbourne@spamgourmet.com> wrote or quoted:
    I don't think there's a tuple being created. If you mean:
    ( word for word in list_ if word[ 0 ]== 'e' )
    ...that's not creating a tuple. It's a generator expression, which
    generates the next value each time it's called for. If you only ever
    ask for the first item, it only generates that one.

    Yes, that's also how I understand it!

    In the meantime, I wrote code for a microbenchmark, shown below.

    This code, when executed on my computer, shows that the
    next+generator approach is a bit faster when compared with
    the procedural break approach. But when the order of the two
    approaches is being swapped in the loop, then it is shown to
    be a bit slower. So let's say, it takes about the same time.

    However, I also tested code with an early return (not shown below),
    and this was shown to be faster than both code using break and
    code using next+generator by a factor of about 1.6, even though
    the code with return has the "function call overhead"!

    But please be aware that such results depend on the Python
    implementation and version being used for the benchmark, and also
    on the details of how exactly the benchmark is written.

    import random
    import string
    import timeit

    print( 'The following loop may need a few seconds or minutes, '
           'so please bear with me.' )

    time_using_break = 0
    time_using_next = 0

    for repetition in range( 100 ):
        for i in range( 100 ): # Yes, this nesting is redundant!

            list_ = \
            [ ''.join \
              ( random.choices \
                ( string.ascii_lowercase, k=random.randint( 1, 30 )))
              for i in range( random.randint( 0, 50 ))]

            start_time = timeit.default_timer()
            for word in list_:
                if word[ 0 ]== 'e':
                    word_using_break = word
                    break
            else:
                word_using_break = ''
            time_using_break += timeit.default_timer() - start_time

            start_time = timeit.default_timer()
            word_using_next = \
            next( ( word for word in list_ if word[ 0 ]== 'e' ), '' )
            time_using_next += timeit.default_timer() - start_time

            if word_using_next != word_using_break:
                raise Exception( 'word_using_next != word_using_break' )

    print( f'{time_using_break = }' )
    print( f'{time_using_next = }' )
    print( f'{time_using_next / time_using_break = }' )
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.lang.python on Fri Apr 5 18:32:22 2024
    From Newsgroup: comp.lang.python

    ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
    However, I also tested code with an early return (not shown below),
    and this was shown to be faster than both code using break and
    code using next+generator by a factor of about 1.6, even though
    the code with return has the "function call overhead"!

    See "return" benchmarked against "break" below!

    import random
    import string
    import timeit

    print( 'The following loop may need a few seconds or minutes, '
           'so please bear with me.' )

    def get_word_using_return( list_ ):
        for word in list_:
            if word[ 0 ]== 'e':
                return word
        return ''

    time_using_break = 0
    time_using_return = 0

    for repetition in range( 100 ):
        for i in range( 100 ): # Yes, this nesting is redundant!

            list_ = \
            [ ''.join \
              ( random.choices \
                ( string.ascii_lowercase, k=random.randint( 1, 30 )))
              for i in range( random.randint( 0, 50 ))]

            start_time = timeit.default_timer()
            for word in list_:
                if word[ 0 ]== 'e':
                    word_using_break = word
                    break
            else:
                word_using_break = ''
            time_using_break += timeit.default_timer() - start_time

            start_time = timeit.default_timer()
            word_using_return = get_word_using_return( list_ )
            time_using_return += timeit.default_timer() - start_time

            if word_using_return != word_using_break:
                raise Exception( 'word_using_return != word_using_break' )

    print( f'{time_using_break = }' )
    print( f'{time_using_return = }' )
    print( f'{time_using_return / time_using_break = }' )
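For comparison, here is a hedged sketch of a different benchmark layout, not the one used in the posts above. It puts all three variants into functions, feeds them the same pre-generated lists, and takes the minimum of several `timeit.repeat` runs so that the order of the variants matters less. The names `using_break`, `using_next`, `using_return`, `lists` and `timings` are made up for this sketch, and the fixed seed is an assumption to keep the input reproducible. One plausible contributor to the factor measured above, not verified here, is that the inline `break` loop runs at module level with global names, while the `return` variant runs inside a function with local names, which CPython generally accesses faster.

```python
import random
import string
import timeit

random.seed(0)  # assumption of this sketch: same data on every run

# Pre-generate the input so list construction is not part of the timing.
lists = [
    [''.join(random.choices(string.ascii_lowercase, k=random.randint(1, 30)))
     for _ in range(random.randint(0, 50))]
    for _ in range(200)
]

def using_break(list_):
    for word in list_:
        if word[0] == 'e':
            found = word
            break
    else:
        found = ''
    return found

def using_next(list_):
    return next((word for word in list_ if word[0] == 'e'), '')

def using_return(list_):
    for word in list_:
        if word[0] == 'e':
            return word
    return ''

timings = {}
for function in (using_break, using_next, using_return):
    # Each run calls the function on every list 20 times; keep the best of 5.
    runs = timeit.repeat(lambda: [function(l) for l in lists],
                         number=20, repeat=5)
    timings[function.__name__] = min(runs)

for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s')
```

The absolute numbers still depend on the interpreter and the machine, so they are best read as a rough comparison only.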
  • From Mark Bourne@nntp.mbourne@spamgourmet.com to comp.lang.python on Fri Apr 5 20:42:15 2024
    From Newsgroup: comp.lang.python

    avi.e.gross@gmail.com wrote:
    That is an excellent point, Mark. Some of the proposed variants to the requested problem, including mine, do indeed find all instances only to return the first. This can use additional time and space but when done, some of the overhead is also gone. What I mean is that a generator you create and invoke once, generally sits around indefinitely in your session unless it leaves your current range or something. It does only a part of the work and must remain suspended and ready to be called again to do more.

    It goes out of scope at the end of the function. Unless you return it
    or store a reference to it elsewhere, it will then be deleted.

    Or in this case, since the `first_word_beginning_with_e` function
    doesn't even have a local reference to the generator (it is just created
    and immediately passed as an argument to `next`), it goes out of scope
    once the `next` function returns.

    If you create a generator inside a function and the function returns, presumably it can be garbage-collected.

    Exactly. It probably doesn't even need to wait for garbage collection -
    once the reference count is zero, it can be destroyed.
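One way to watch this happen, as a sketch under the assumption of CPython's reference counting (other implementations may finalize the generator later): a source generator with a `finally` clause records when it is torn down. The names `events`, `words` and `first_e` are made up for illustration.

```python
events = []

def words():
    # Source generator; the finally clause runs when it is finalized.
    try:
        for w in ['delta', 'epsilon', 'zeta']:
            events.append(f'yield {w}')
            yield w
    finally:
        events.append('generator finalized')

def first_e():
    # The generator expression has no name here; it is only ever
    # referenced as the argument to next().
    return next((w for w in words() if w[0] == 'e'), None)

result = first_e()
events.append('function returned')

print(result)
# On CPython the 'generator finalized' entry is expected to appear
# before 'function returned', and 'zeta' is never yielded.
print(events)
```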

    But if it is in the main body, I have to wonder what happens.

    If you mean in the top-level module scope outside of any
    function/method, then it would remain in memory until the process exits.

    There seem to be several related scenarios to consider.

    - You may want to find, in our example, a first instance. Right afterwards, you want the generator to disassemble anything in use.
    - You may want the generator to stick around and later be able to return the next instance. The generator can only really go away when another call has been made after the last available instance and it cannot look for more beyond some end.
    - Finally, you can call a generator with the goal of getting all instances such as by asking it to populate a list. In such a case, you may not necessarily want or need to use a generator expression and can use something straightforward and possibly cheaper.

    Yes, so you create and assign it at an appropriate scope. In the
    example here, it's just passed to `next` and then destroyed. Passing a generator to the `list` constructor (or the `tuple` constructor in my
    "FWIW") would behave similarly - you'd get the final list/tuple back,
    but the generator would be destroyed once that call is done. If you
    assigned it to a function-local variable, it would exist until the end
    of that function.
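The three scenarios can be sketched like this, reusing the word list from the start of the thread. This is only an illustration; the names `matches`, `first`, `second`, `third` and `all_matches` are made up.

```python
list_ = ['delta', 'epsilon', 'zeta', 'eta', 'theta']

# Scenario 1: only the first match is wanted. The generator is never
# bound to a name, so nothing keeps it alive after next() returns.
only_first = next((word for word in list_ if word[0] == 'e'), None)

# Scenario 2: keep a reference so later matches can be requested.
matches = (word for word in list_ if word[0] == 'e')
first = next(matches, None)    # 'epsilon'
second = next(matches, None)   # 'eta'
third = next(matches, None)    # None: the generator is exhausted

# Scenario 3: every match is wanted anyway, so a list comprehension
# is more direct than a generator.
all_matches = [word for word in list_ if word[0] == 'e']

print(only_first, first, second, third, all_matches)
```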

    What confuses the issue, for me, is that you can make fairly complex calculations in python using various forms of generators that implement a sort of just-in-time approach as generators call other generators which call yet others and so on.

    Yes, you can. It can be quite useful when used appropriately.

    Imagine having folders full of files that each contain a data structure such as a dictionary or set and writing functionality that searches for the first match for a key in any of the dictionaries (or sets or whatever) along the way? Now imagine that dictionary items can be a key value pair that can include the value being a deeper dictionary, perhaps down multiple levels.

    You could get one generator that generates folder names or opens them and another that generates file names and reads in the data structure such as a dictionary and yet another that searches each dictionary and also any internally embedded dictionaries by calling another instance of the same generator as much as needed.

    You probably could do that. Personally, I probably wouldn't use
    generators for that, or at least not custom ones - if you're talking
    about iterating over directories and files on disk, I'd probably just
    use `os.walk` (which probably is a generator) and iterate over that,
    opening each file and doing whatever you want with the contents.
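A rough sketch of that idea, with several assumptions that are not in the thread: the files are JSON, each holds a dictionary, and the helper names `find_key` and `search_tree` are invented here. In CPython `os.walk` is indeed implemented as a generator, so the search stops reading files as soon as the first match has been produced.

```python
import json
import os
import tempfile

def find_key(data, key):
    """Yield every value stored under `key`, descending into nested dicts."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            yield from find_key(v, key)

def search_tree(root, key):
    """Yield matches from every .json file below `root` (assumed format)."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith('.json'):
                continue
            # The file is closed again before any value is yielded.
            with open(os.path.join(dirpath, name), encoding='utf-8') as f:
                data = json.load(f)
            yield from find_key(data, key)

# Demonstration with two throw-away files.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, 'a.json'), 'w', encoding='utf-8') as f:
        json.dump({'size': 1}, f)
    with open(os.path.join(root, 'b.json'), 'w', encoding='utf-8') as f:
        json.dump({'outer': {'inner': {'colour': 'red'}}}, f)
    first = next(search_tree(root, 'colour'), None)

print(first)  # red
```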

    You can see how this creates and often consumes generators along the way as needed and in a sense does the minimum amount of work needed to find a first instance. But what might it leave open and taking up resources if not finished in a way that dismantles it?

    You'd need to make sure any files are closed (`with open(...)` helps
    with that). If you're opening files within a generator, I'm pretty sure
    you can do something like:
    ```
    def iter_files(directory):
        for filename in directory:
            with open(filename) as f:
                yield f
    ```

    Then the file will be closed when the iterator leaves the `with` block
    and moves on to the next item (presumably there's some mechanism for the context manager's `__exit__` to be called if the generator is destroyed without having iterated over all the items - the whole point of using `with`
    is that `__exit__` is guaranteed to be called whatever happens).
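That mechanism can be observed with a stand-in for a file, as a small sketch: closing a suspended generator raises `GeneratorExit` at the `yield`, which unwinds the `with` block. The context manager `resource` and the generator `iter_resources` are made-up names, and `close()` is called explicitly here so the result does not depend on when the interpreter destroys the generator.

```python
from contextlib import contextmanager

log = []

@contextmanager
def resource(name):
    # Stand-in for open(); records when it is entered and exited.
    log.append(f'open {name}')
    try:
        yield name
    finally:
        log.append(f'close {name}')

def iter_resources(names):
    for name in names:
        with resource(name) as r:
            yield r

gen = iter_resources(['a', 'b', 'c'])
first = next(gen)   # suspended inside the with block for 'a'
gen.close()         # raises GeneratorExit at the yield, so the with block exits

print(first)  # a
print(log)    # ['open a', 'close a']
```

Generators that are simply dropped are closed in the same way when they are finalized, but the timing of that step depends on the Python implementation.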

    Other than that, the generators themselves would be destroyed once they
    go out of scope. If there are no references to a generator left,
    nothing is going to be able to call `next` (nor anything else) on it, so
    no need for it to be kept hanging around in memory.

    Perhaps worse, imagine doing the search in parallel and as soon as it is found anywhere, ...



  • From Mark Bourne@nntp.mbourne@spamgourmet.com to comp.lang.python on Fri Apr 5 20:59:54 2024
    From Newsgroup: comp.lang.python

    Stefan Ram wrote:
    Mark Bourne <nntp.mbourne@spamgourmet.com> wrote or quoted:
    I don't think there's a tuple being created. If you mean:
    ( word for word in list_ if word[ 0 ]== 'e' )
    ...that's not creating a tuple. It's a generator expression, which
    generates the next value each time it's called for. If you only ever
    ask for the first item, it only generates that one.

    Yes, that's also how I understand it!

    In the meantime, I wrote code for a microbenchmark, shown below.

    This code, when executed on my computer, shows that the
    next+generator approach is a bit faster when compared with
    the procedural break approach. But when the order of the two
    approaches is being swapped in the loop, then it is shown to
    be a bit slower. So let's say, it takes about the same time.

    There could be some caching going on, meaning whichever is done second
    comes out a bit faster.

    However, I also tested code with an early return (not shown below),
    and this was shown to be faster than both code using break and
    code using next+generator by a factor of about 1.6, even though
    the code with return has the "function call overhead"!

    To be honest, that's how I'd probably write it - not because of any
    thought that it might be faster, but just because it's clearer. And if
    there's a `do_something_else()` that needs to be called regardless of
    whether a word was found, split it into two functions:
    ```
    def first_word_beginning_with_e(target, wordlist):
        for w in wordlist:
            if w.startswith(target):
                return w
        return ''

    def find_word_and_do_something_else(target, wordlist):
        result = first_word_beginning_with_e(target, wordlist)
        do_something_else()
        return result
    ```
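A different option, not proposed in the thread, is to keep the early `return` in a single function and put the final step into a `finally` clause, which Python runs on every exit path from the `try` block. This is only a sketch; `do_something_else` is a hypothetical stand-in and `calls` exists just to make its effect visible.

```python
calls = []

def do_something_else():
    # Hypothetical stand-in for the step that must always run.
    calls.append('done')

def find_word_and_do_something_else(target, wordlist):
    try:
        for w in wordlist:
            if w.startswith(target):
                return w      # early exit; the finally clause still runs
        return ''
    finally:
        do_something_else()

print(find_word_and_do_something_else('e', ['delta', 'epsilon', 'zeta']))  # epsilon
print(repr(find_word_and_do_something_else('x', ['delta', 'epsilon'])))    # ''
print(calls)  # ['done', 'done']
```

Splitting into two functions keeps the search reusable on its own, while the `finally` version keeps everything in one place, so which reads better is a matter of taste.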
