• Paragraph Wrapping

    From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.programming on Thu Jan 26 19:50:39 2023
    From Newsgroup: comp.programming

    To wrap a paragraph, I wrote a Python script that searched for
    the best wrap, given a quality function that included penalties
    for interpunction near the end of a line (but not at the end)
    or adjacent lines that differed greatly in length.

    It worked, but as the number of lines approached ten, it became
    painfully slow.

    By an incredible twist of fate, I came across a chapter of
    a book that explained dynamic programming and how to use it
    to reduce the complexity of global paragraph wrapping from
    exponential to quadratic!

    So I modified my script yesterday, and today all the paragraphs
    you read here are wrapped by the new script, because I configured
    my editor to run paragraphs through this script whenever I press
    a certain sequence of keys.

    So how does this work?

    An exponential paragraph wrapping algorithm tries every
    combination of break points and gets a quality score for each.
    I think my first program must have looked like this. I may have
    eliminated some obviously bad line breaks, but basically it
    must have been an exponential algorithm.

    The dynamic programming algorithm, as I understand it, works like
    this and can still find the global optimum: I have a loop that
    goes through each possible break point just once (which would be
    linear). Then this loop has an inner loop that goes through each
    previous breakpoint (now it's quadratic). The previous breakpoints
    have already been evaluated. So it can easily find the best
    combination of a previous breakpoint and a possible break at the
    current position. This is the quality score for a break at the
    current position. Finally, you start at the break point just after
    the last word of the paragraph and go from there to all the best
    preceding break points to identify the lines of the result.

    However, it seems to me that I am a bit limited in the quality
    measure function, because this function seems to support best
    quality measures that only consider the current line and maybe the
    combination of the current line with the preceding line. The global
    quality measure function I had before could also easily measure
    such global quantities as the standard deviation of the lengths of
    all the lines of the result. This is not accessible to the dynamic
    programming algorithm as I understand it, because when it is in the
    middle of the paragraph, it does not yet know the lengths of all
    the lines, and when it is at the end of the paragraph, it cannot
    change decisions that have already been made.

    The same book also gave another problem that could supposedly be
    solved using dynamic programming: In a restaurant you are shown
    five dishes in a sequence, and you can choose one to eat. You are
    shown only one dish at a time and do not know which dish will be
    shown next. Once you accept or reject a dish, you cannot go back on
    your decision. If you do not choose any of the first four dishes,
    this means that you would inevitably eat the last one. How should
    you proceed to maximize the probability of getting the best dish?

    The solution given in the book begins by explaining that you
    assign a quality score between 0 and 1 to each dish you see.
    So the question is how to proceed to maximize the probability
    of eating a dish with a quality score as high as possible . . .


    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From Noel Duffy@uaigh@icloud.com to comp.programming on Fri Jan 27 09:06:28 2023
    From Newsgroup: comp.programming

    On 27/01/23 08:50, Stefan Ram wrote:
    By an incredible twist of fate, I came across a chapter of
    a book that explained dynamic programming and how to use it
    to reduce the complexity of global paragraph wrapping from
    exponential to quadratic!

    Well don't keep us in suspense! What's the book?

    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.programming on Thu Jan 26 20:20:12 2023
    From Newsgroup: comp.programming

    Noel Duffy <uaigh@icloud.com> writes:
    On 27/01/23 08:50, Stefan Ram wrote:
    By an incredible twist of fate, I came across a chapter of
    a book that explained dynamic programming and how to use it
    to reduce the complexity of global paragraph wrapping from
    exponential to quadratic!
    Well don't keep us in suspense! What's the book?

    Ok, but since it contains a solution to the food dish task,
    I may wait a week.


    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From Richard Heathfield@rjh@cpax.org.uk to comp.programming on Thu Jan 26 21:00:27 2023
    From Newsgroup: comp.programming

    On 26/01/2023 7:50 pm, Stefan Ram wrote:
    The same book also gave another problem that could supposedly be
    solved using dynamic programming: In a restaurant you are shown
    five dishes in a sequence, and you can choose one to eat. You are
    shown only one dish at a time and do not know which dish will be
    shown next. Once you accept or reject a dish, you cannot go back on
    your decision. If you do not choose any of the first four dishes,
    this means that you would inevitably eat the last one. How should
    you proceed to maximize the probability of getting the best dish?

    The solution given in the book begins by explaining that you
    assign a quality score between 0 and 1 to each dish you see.
    So the question is how to proceed to maximize the probability
    of eating a dish with a quality score as high as possible . . .

    0123456789 spoiler space
    0123456789 spoiler spac
    0123456789 spoiler spa
    0123456789 spoiler sp
    0123456789 spoiler s
    0123456789 spoiler
    0123456789 spoiler
    0123456789 spoile
    0123456789 spoil
    0123456789 spoi
    0123456789 spo
    0123456789 sp
    0123456789 s
    0123456789
    0123456789
    012345678
    01234567
    0123456
    012345
    01234
    0123
    012
    01


    Reject (but score) the first two dishes, and then accept the
    first dish that scores better than any you have yet seen (or the
    last if you must and are very hungry).

    This algorithm will pick the best of five dishes about seven
    times in twenty visits.
    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From Ben Bacarisse@ben.usenet@bsb.me.uk to comp.programming on Fri Jan 27 01:22:19 2023
    From Newsgroup: comp.programming

    ram@zedat.fu-berlin.de (Stefan Ram) writes:

    To wrap a paragraph, I wrote a Python script that searched for
    the best wrap, given a quality function that included penalties
    for interpunction near the end of a line (but not at the end)
    or adjacent lines that differed greatly in length.

    It worked, but as the number of lines approached ten, it became
    painfully slow.

    By an incredible twist of fate, I came across a chapter of
    a book that explained dynamic programming and how to use it
    to reduce the complexity of global paragraph wrapping from
    exponential to quadratic!

    The TeX Book describes two-level algorithm to calculate optimal line
    breaks in paragraphs and the optimal page breaks.
    --
    Ben.
    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.programming on Thu Feb 2 21:02:06 2023
    From Newsgroup: comp.programming

    Noel Duffy <uaigh@icloud.com> writes:
    On 27/01/23 08:50, Stefan Ram wrote:
    By an incredible twist of fate, I came across a chapter of
    a book that explained dynamic programming and how to use it
    to reduce the complexity of global paragraph wrapping from
    exponential to quadratic!
    Well don't keep us in suspense! What's the book?

    |The Computer Science of TeX and LaTeX;
    |based on CS 594, fall 2004, University of Tennessee
    |
    |Victor Eijkhout
    |Texas Advanced Computing Center
    |The University of Texas at Austin
    |
    |2012


    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From Richard Heathfield@rjh@cpax.org.uk to comp.programming on Thu Feb 2 21:44:32 2023
    From Newsgroup: comp.programming

    On 02/02/2023 9:02 pm, Stefan Ram wrote:
    Noel Duffy <uaigh@icloud.com> writes:
    On 27/01/23 08:50, Stefan Ram wrote:
    By an incredible twist of fate, I came across a chapter of
    a book that explained dynamic programming and how to use it
    to reduce the complexity of global paragraph wrapping from
    exponential to quadratic!
    Well don't keep us in suspense! What's the book?

    |The Computer Science of TeX and LaTeX;
    |based on CS 594, fall 2004, University of Tennessee
    |
    |Victor Eijkhout
    |Texas Advanced Computing Center
    |The University of Texas at Austin
    |
    |2012

    Did you even notice my reply re the food dish question?
    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.programming on Thu Feb 2 22:03:06 2023
    From Newsgroup: comp.programming

    Richard Heathfield <rjh@cpax.org.uk> writes:
    Did you even notice my reply re the food dish question?

    Yes. Sorry. I should have written a reply!

    So, what that book by Eijkhout said is in my words:

    The last dish could be anything, so that its expected value of
    the rating function on the scale from 0 to 1 is equal to 0.5.

    Therefore, one would choose the penultimate dish if its
    rating is better than 0.5, because then it would be better
    than the alternative.

    If one assumes that the penultimate dish can be anything, it is
    better than 0.5 in half of the cases, in such a case, one would
    choose it, and since it then lies somewhere between 0.5 and 1, the
    expected value is then 0.75. Otherwise, one takes the last dish with
    an expected value of 0.5. Overall, with this strategy and two dishes,
    the expected value is then the mean of 0.75 and 0.5, i.e., 0.625.

    So, the second-to-last dish would have to be better than 0.625
    in order to be chosen.

    I wrote a small Python program to compare this with the strategy you
    mentioned. The result is that the strategy given above fares better
    by a factor of 1.2146 if every dish's quality is randomly between 0
    and 1. But the traditional strategy you gave fares better by a factor
    between 0.8 and 0.99 when the quality of all the dishes is a priori
    restricted to some range [a,b], where a < b and 0 <= a <= 1, at least
    for three such cases I looked at.

    Which is "the best" solution, then, might depend on details of
    the wording of the question, i.e., whether it states that every
    dish can have an arbitrary quality between 0 and 1 or whether
    a certain cook always produces dishes between a and b (but how
    could he know or control the ratings the customer will give?).


    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.programming on Thu Feb 2 22:10:13 2023
    From Newsgroup: comp.programming

    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    restricted to some range [a,b], where a < b and 0 <= a <= 1, at least

    restricted to some range [a,b], where a < b and 0 <= a, b <= 1, at least


    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.programming on Thu Feb 2 22:56:09 2023
    From Newsgroup: comp.programming

    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    If one assumes that the penultimate dish can be anything, it is
    better than 0.5 in half of the cases, in such a case, one would

    That was a comma splice; it should read: "... cases; in ...".

    I wrote a small Python program to compare this with the strategy you >mentioned. The result is that the strategy given above fares better
    by a factor of 1.2146 if every dish's quality is randomly between 0

    ... or some value near 1.2146

    and 1. But the traditional strategy you gave fares better by a factor >between 0.8 and 0.99 when the quality of all the dishes is a priori >restricted to some range [a,b], where a < b and 0 <= a <= 1, at least
    for three such cases I looked at.

    Here's that Python program:

    # Python 3.8

    import itertools
    import random

    last_dish = 4

    def calculate_thresholds_for_the_dynamic_method():
    global threshold
    threshold =[ None ]*( last_dish + 1 )
    for position in range( last_dish, -1, -1 ):
    if position == last_dish:
    expectation_value = 0.5
    else:
    expectation_value_if_chosen =( 1 + expectation_value )/ 2
    probability_of_choice =( 1 - expectation_value )
    expectation_value = \
    expectation_value_if_chosen * probability_of_choice + \
    expectation_value *( 1 - probability_of_choice )
    threshold[ position ]= expectation_value

    def generation_of_five_random_numbers():
    global numbers
    numbers = []
    for _ in range( last_dish + 1 ):
    numbers.append( random.random() )

    def traditional_strategy():
    seen = []
    for dish in range( last_dish + 1 ):
    number = numbers[ dish ]
    if dish in[ 0, 1 ]:
    seen.append( number )
    elif dish in[ 2, 3 ]:
    if number > max( seen ):
    return dish
    seen.append( number )
    else: return dish

    def dynamic_strategy():
    for dish in range( last_dish + 1 ):
    number = numbers[ dish ]
    if number > threshold[ dish ]: return dish
    if dish == last_dish: return dish

    def compare_both_strategies_using_the_expectation_value():
    calculate_thresholds_for_the_dynamic_method()
    traditional_total = 0
    dynamic_total = 0
    for i in itertools.count():
    generation_of_five_random_numbers()
    traditional_result = numbers[ traditional_strategy() ]
    dynamic_result = numbers[ dynamic_strategy() ]
    traditional_total += traditional_result
    dynamic_total += dynamic_result
    if not( i % 100000 ):
    print( f'{dynamic_total/traditional_total=:.5}' )

    def compare_both_strategies_using_finding_the_best():
    calculate_thresholds_for_the_dynamic_method()
    traditional_count = 0
    dynamic_total = 0
    for i in itertools.count():
    generation_of_five_random_numbers()
    traditional_result = numbers[ traditional_strategy() ]
    dynamic_result = numbers[ dynamic_strategy() ]
    if traditional_result == max( numbers ): traditional_total += 1
    if dynamic_result == max( numbers ): dynamic_total += 1
    if not( i % 100000 ):
    print( f'{dynamic_total/traditional_total=:.5}' )

    compare_both_strategies_using_the_expectation_value()

    .


    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From Y A@air000000000000@ya.ee to comp.programming on Fri Feb 3 06:24:05 2023
    From Newsgroup: comp.programming

    div {
    word-wrap: break-word;
    }




    On Thursday, January 26, 2023 at 10:06:33 PM UTC+2, Noel Duffy wrote:
    On 27/01/23 08:50, Stefan Ram wrote:
    By an incredible twist of fate, I came across a chapter of
    a book that explained dynamic programming and how to use it
    to reduce the complexity of global paragraph wrapping from
    exponential to quadratic!

    Well don't keep us in suspense! What's the book?
    --- Synchronet 3.20a-Linux NewsLink 1.113
  • From Ezimene nimi Teine nimi@ezimenenimiteinenimi@gmail.com to comp.programming on Wed Mar 1 11:59:54 2023
    From Newsgroup: comp.programming

    Good day, lord.
    On Thursday, January 26, 2023 at 11:00:35 PM UTC+2, Richard Heathfield wrote:
    On 26/01/2023 7:50 pm, Stefan Ram wrote:
    The same book also gave another problem that could supposedly be
    solved using dynamic programming: In a restaurant you are shown
    five dishes in a sequence, and you can choose one to eat. You are
    shown only one dish at a time and do not know which dish will be
    shown next. Once you accept or reject a dish, you cannot go back on
    your decision. If you do not choose any of the first four dishes,
    this means that you would inevitably eat the last one. How should
    you proceed to maximize the probability of getting the best dish?

    The solution given in the book begins by explaining that you
    assign a quality score between 0 and 1 to each dish you see.
    So the question is how to proceed to maximize the probability
    of eating a dish with a quality score as high as possible . . .

    0123456789 spoiler space
    0123456789 spoiler spac
    0123456789 spoiler spa
    0123456789 spoiler sp
    0123456789 spoiler s
    0123456789 spoiler
    0123456789 spoiler
    0123456789 spoile
    0123456789 spoil
    0123456789 spoi
    0123456789 spo
    0123456789 sp
    0123456789 s
    0123456789
    0123456789
    012345678
    01234567
    0123456
    012345
    01234
    0123
    012
    01


    Reject (but score) the first two dishes, and then accept the
    first dish that scores better than any you have yet seen (or the
    last if you must and are very hungry).

    This algorithm will pick the best of five dishes about seven
    times in twenty visits.

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within
    --- Synchronet 3.20a-Linux NewsLink 1.114