• Re: 2GB limitation

    From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Mon Jul 22 17:06:07 2024
    From Newsgroup: comp.lang.tcl

    Wow, that's unexpected and cool!
    Thanks
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Rich@rich@example.invalid to comp.lang.tcl on Mon Jul 22 17:17:21 2024
    From Newsgroup: comp.lang.tcl

    Andreas Leitgeb <avl@logic.at> wrote:
    > alexandru <alexandru.dadalau@meshparts.de> wrote:
    >> Will there be a fix for the 2GB size limit that a string representation
    >> has in Tcl?
    >> Maybe already fixed in Tcl 9.0?
    >
    > Yes, that's one of the reasons for switching to tcl9 as soon as
    > possible.

    What is the new larger "limit" in Tcl9?
  • From Emiliano@emiliano@example.invalid to comp.lang.tcl on Mon Jul 22 21:58:03 2024
    From Newsgroup: comp.lang.tcl

    On Mon, 22 Jul 2024 17:17:21 -0000 (UTC)
    Rich <rich@example.invalid> wrote:

    > Andreas Leitgeb <avl@logic.at> wrote:
    >> alexandru <alexandru.dadalau@meshparts.de> wrote:
    >>> Will there be a fix for the 2GB size limit that a string representation
    >>> has in Tcl?
    >>> Maybe already fixed in Tcl 9.0?
    >>
    >> Yes, that's one of the reasons for switching to tcl9 as soon as
    >> possible.
    >
    > What is the new larger "limit" in Tcl9?

    In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number
    of bytes at the '*bytes' member, not including the terminating null) has
    changed from int to ptrdiff_t, so the limit remains (1<<31)-1 => 2147483647
    bytes on 32 bit platforms (unsurprisingly) and becomes (1<<63)-1 =>
    9223372036854775807 bytes (about 9.22 exabytes) on 64 bit platforms.

    IIUC that's also the (new) number of elements for a Tcl list. In practice
    the number will be less, since the length of the string representation of
    such list will hit the '*bytes' max length first.
    --
    Emiliano
  • From Harald Oehlmann@wortkarg3@yahoo.com to comp.lang.tcl on Wed Jul 24 09:07:15 2024
    From Newsgroup: comp.lang.tcl

    On 22.07.2024 at 19:17, Rich wrote:
    > Andreas Leitgeb <avl@logic.at> wrote:
    >> alexandru <alexandru.dadalau@meshparts.de> wrote:
    >>> Will there be a fix for the 2GB size limit that a string representation
    >>> has in Tcl?
    >>> Maybe already fixed in Tcl 9.0?
    >>
    >> Yes, that's one of the reasons for switching to tcl9 as soon as
    >> possible.
    >
    > What is the new larger "limit" in Tcl9?

    expr {2**63 - 1}
  • From Andreas Leitgeb@avl@logic.at to comp.lang.tcl on Wed Jul 24 16:22:53 2024
    From Newsgroup: comp.lang.tcl

    Emiliano <emiliano@example.invalid> wrote:
    > In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number
    > of bytes at the '*bytes' member, not including the terminating null) has
    > changed from int to ptrdiff_t, so the limit remains (1<<31)-1 => 2147483647
    > bytes on 32 bit platforms (unsurprisingly) and becomes (1<<63)-1 =>
    > 9223372036854775807 bytes (about 9.22 exabytes) on 64 bit platforms.

    My hearsay was "generally 64 bit (minus the sign-bit)".
    Are you sure that the length type is *always* ptrdiff_t, and
    that this may be 32bit?

    The "64bit'ness" of a platform is also a bit more complicated...
    There are platforms where pointers are 64bit, but ints are
    still 32 (despite machine words being all 64bit) - in those
    cases, I'd expect ptrdiff_t to be 64 bit, but on a really old
    32bit machine, I don't really know for sure...

    > IIUC that's also the (new) number of elements for a Tcl list.
    > In practice the number will be less, since the length of the
    > string representation of such list will hit the '*bytes' max
    > length first.

    Not all lists are ever turned into a string rep. While they are
    semantically "just strings", well-written programs can avoid
    ever generating the string rep, at least for the really long
    lists that are relevant here.

  • From Emiliano@emiliano@example.invalid to comp.lang.tcl on Wed Jul 24 17:05:19 2024
    From Newsgroup: comp.lang.tcl

    On Wed, 24 Jul 2024 16:22:53 -0000 (UTC)
    Andreas Leitgeb <avl@logic.at> wrote:

    > Emiliano <emiliano@example.invalid> wrote:
    >> In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number
    >> of bytes at the '*bytes' member, not including the terminating null) has
    >> changed from int to ptrdiff_t, so the limit remains (1<<31)-1 => 2147483647
    >> bytes on 32 bit platforms (unsurprisingly) and becomes (1<<63)-1 =>
    >> 9223372036854775807 bytes (about 9.22 exabytes) on 64 bit platforms.
    >
    > My hearsay was "generally 64 bit (minus the sign-bit)".
    > Are you sure that the length type is *always* ptrdiff_t, and
    > that this may be 32bit?

    In 9.X, it is ptrdiff_t. In 8.Y it is still int.

    See https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=325-333 and
    https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=740-752

    ptrdiff_t can still be only 32 bits wide. See below.

    > The "64bit'ness" of a platform is also a bit more complicated...
    > There are platforms where pointers are 64bit, but ints are
    > still 32 (despite machine words being all 64bit) - in those
    > cases, I'd expect ptrdiff_t to be 64 bit, but on a really old
    > 32bit machine, I don't really know for sure...

    This is what I mean when I say "on 32-bit platforms it is still 2GB",
    since the i386-i686 platforms have a 32 bit ptrdiff_t.

    On my ancient i686 machine:

    $ uname -m
    i686
    $ tclsh9.0
    % expr {(1 << (8 * $tcl_platform(pointerSize))-1) - 1}
    2147483647
    % package provide Tcl
    9.0b3
    % set tcl_platform(pointerSize)
    4

    >> IIUC that's also the (new) number of elements for a Tcl list.
    >> In practice the number will be less, since the length of the
    >> string representation of such list will hit the '*bytes' max
    >> length first.
    >
    > Not all lists are ever turned into a string rep. While they are
    > semantically "just strings", well-written programs can avoid
    > ever generating the string rep, at least for the really long
    > lists that are relevant here.

    Yes, but that's an optimization. Tcl semantics are still defined
    in terms of string operations. I prefer not to depend on internals.
    --
    Emiliano