• Ilya Sutskever: The Next Oppenheimer

    From Mild Shock@janburse@fastmail.fm to comp.lang.prolog on Wed Dec 18 15:42:07 2024
    From Newsgroup: comp.lang.prolog

    Hi,

    I liked some videos on YouTube:

    Ilya Sutskever: The Next Oppenheimer https://www.youtube.com/watch?v=jryDWOKikys

    Ilya Sutskever: Sequence to Sequence Learning https://www.youtube.com/watch?v=WQQdd6qGxNs

    Bye
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Mild Shock@janburse@fastmail.fm to comp.lang.prolog on Thu Dec 19 10:42:10 2024
    From Newsgroup: comp.lang.prolog

    Hi,

Could it be that "o1" refers to "Optimizer 1"?
    And what could this include?

    - Quantization: compressing weights or activations
    into fewer bits can significantly reduce computation,
    especially in hardware, mimicking O(1)-like
    efficiency for certain operations (first sketch
    after this list).

    - Pruning: removing redundant connections in the
    neural network leads to fewer computations. Sparse
    matrix operations can then replace dense ones, making
    specific inference tasks faster (second sketch below).

    - Distillation: large models are distilled into
    smaller ones with similar capabilities, reducing
    computational costs during inference. If the optimized
    paths are cleverly structured, their complexity might
    come closer to O(1) for lookup-style tasks (third
    sketch below).
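
    To make the first point concrete, here is a minimal
    sketch of generic 8-bit affine quantization in Python
    with numpy (an illustration of the technique only,
    not a claim about how "o1" is built):

        import numpy as np

        def quantize_int8(w):
            # Map float32 weights onto the int8 range [-128, 127].
            scale = (w.max() - w.min()) / 255.0
            zero_point = np.round(-w.min() / scale) - 128
            q = np.clip(np.round(w / scale) + zero_point, -128, 127)
            return q.astype(np.int8), scale, zero_point

        def dequantize_int8(q, scale, zero_point):
            # Recover an approximation of the original weights.
            return (q.astype(np.float32) - zero_point) * scale

        w = np.random.randn(4, 4).astype(np.float32)
        q, s, z = quantize_int8(w)
        w_hat = dequantize_int8(q, s, z)
        print(np.abs(w - w_hat).max())  # small error, 4x less memory

    The win comes from cheaper integer arithmetic and
    smaller memory traffic, i.e. a much better constant
    factor per operation.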
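
    The second point, magnitude pruning followed by a
    sparse matrix format, can be sketched like this
    (again an assumed, generic recipe using numpy and
    scipy, with the 90% sparsity level made up for
    illustration):

        import numpy as np
        from scipy.sparse import csr_matrix

        def magnitude_prune(w, sparsity=0.9):
            # Zero out the smallest-magnitude weights.
            threshold = np.quantile(np.abs(w), sparsity)
            return w * (np.abs(w) >= threshold)

        w = np.random.randn(512, 512).astype(np.float32)
        w_sparse = csr_matrix(magnitude_prune(w))  # keep ~10% of weights
        x = np.random.randn(512).astype(np.float32)
        y = w_sparse @ x  # sparse matvec: ~10x fewer multiply-adds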
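
    And the third point, knowledge distillation in the
    style of Hinton et al. 2015: a small student model is
    trained to match the softened output distribution of
    a large teacher. A sketch of the loss (the temperature
    T and the random logits here are placeholders for
    illustration):

        import numpy as np

        def softmax(z, T=1.0):
            e = np.exp(z / T - (z / T).max(axis=-1, keepdims=True))
            return e / e.sum(axis=-1, keepdims=True)

        def distillation_loss(student_logits, teacher_logits, T=2.0):
            # KL divergence between softened teacher and student outputs.
            p = softmax(teacher_logits, T)  # soft targets from the teacher
            q = softmax(student_logits, T)
            return (p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T

        teacher = np.random.randn(8, 10)  # logits of a large model
        student = np.random.randn(8, 10)  # logits of a small model
        print(distillation_loss(student, teacher))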

    So maybe Ilya Sutskever wants to tell us, when he
    refers to the 700g brain line in his recent talk:
    look, we did the same as biological evolution, we
    found a way to construct more compact brains.

    Bye

    --- Synchronet 3.20a-Linux NewsLink 1.114