• Re: IA-64

    From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.arch on Mon Mar 23 21:01:15 2026
    From Newsgroup: comp.arch

    Terje Mathisen <terje.mathisen@tmsw.no> writes:

    Tim Rentsch wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> writes:

    Tim Rentsch wrote:

    [...]

    An unrelated item for your reading pleasure...

    Take an unbiased coin and start flipping it. Keep flipping until
    the number of heads first exceeds the number of tails. Compute the
    fraction: the number of heads divided by the number of flips (which
    always gives a number between 0.5 and 1.0).

    Repeat the above process as many times as desired. Compute the
    average of all the fractions and what do you get?

    I heard about this yesterday from a friend. That's a hint, of
    sorts. (It is now Sunday afternoon where I am.)

    So, by definition the list of possible sequences starts with
    H ; 1/2 of all
    THH ; 1/8
    TTHHH ; 1/32
    THTHH ; 1/32 Sum up to here is 22/32
    TTTHHHH ; 1/128
    TTHTHHH
    TTHHTHH
    THTTHHH
    THTHTHH
    etc

    Here's a wild-assed guess: sqrt(0.5) = 0.707
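    Terje's enumeration above can be checked by brute force. A minimal
    sketch (assuming the stopping rule exactly as stated: stop the first
    time heads exceeds tails):

```python
from itertools import product

def first_passage(seq):
    """True iff heads first exceeds tails exactly at the end of seq."""
    h = t = 0
    for i, c in enumerate(seq):
        h += (c == 'H')
        t += (c == 'T')
        if h > t:
            return i == len(seq) - 1   # must not have stopped earlier
    return False

# Count stopping sequences of each odd length up to 7 flips.
counts = {n: sum(first_passage(''.join(s)) for s in product('HT', repeat=n))
          for n in (1, 3, 5, 7)}
print(counts)    # {1: 1, 3: 1, 5: 2, 7: 5} -- the Catalan numbers
# Probability mass of these lengths: 1/2 + 1/8 + 2/32 + 5/128
mass = sum(c / 2**n for n, c in counts.items())
print(mass)      # 0.7265625, i.e. the 22/32 above plus 5/128
```

    The counts 1, 1, 2, 5 match the list above, including the five
    length-7 sequences.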

    That's an interesting idea for how to analyze it. I'm not sure it
    works. One thing I can say for sure is when I tried to replicate it
    in a program I got wrong answers, or maybe it converges very slowly.
    An easy way to get a result that matches the theoretical value is
    just to simulate the coin flips using a random number generator. To
    save you the trouble of doing that the ultimate value is pi/4 (and
    it converges VERY slowly).
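    The simulation described can be replicated along these lines (a hedged
    sketch; the cap on run length is an artificial cutoff to keep runtime
    bounded, and it introduces a small bias, since the first-passage time
    has infinite expectation):

```python
import random

def average_fraction(trials=10000, cap=4000, seed=1):
    """Flip until heads first exceeds tails; average heads/flips."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        heads = flips = 0
        while heads <= flips - heads:      # i.e. heads <= tails
            heads += rng.random() < 0.5
            flips += 1
            if flips >= cap:               # truncate pathological runs
                break
        total += heads / flips
    return total / trials

print(average_fraction())   # close to pi/4 ~ 0.7854, converging slowly
```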

    So it is related to calculating pi by picking two random numbers and
    using them as coordinates in the [0..1 x 0..1] square.

    If that's true I don't see how or why it's true. I haven't tried
    to understand the derivation I was given earlier.

    pi/4 =~ 0.78539816, so a bit larger than my wild-assed guess. :-)

    I thought your guess was pretty reasonable. I didn't have an
    opportunity to make a guess because I knew the answer before
    I understood the method.

    Incidentally, the hint mentioned above is that I heard about it on
    pi day, March 14th. :)

    I did not grok that hint. :-(

    Definitely a very subtle hint. I didn't really expect anyone to
    get it, but I wanted to at least give an opportunity. And I've
    been surprised before by how smart some netizens are.
    --- Synchronet 3.21f-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue Mar 24 09:24:27 2026
    From Newsgroup: comp.arch

    On 24/03/2026 05:01, Tim Rentsch wrote:
    Terje Mathisen <terje.mathisen@tmsw.no> writes:

    Tim Rentsch wrote:

    Terje Mathisen <terje.mathisen@tmsw.no> writes:

    Tim Rentsch wrote:

    [...]

    An unrelated item for your reading pleasure...

    Take an unbiased coin and start flipping it. Keep flipping until
    the number of heads first exceeds the number of tails. Compute the
    fraction: the number of heads divided by the number of flips (which
    always gives a number between 0.5 and 1.0).

    Repeat the above process as many times as desired. Compute the
    average of all the fractions and what do you get?

    I heard about this yesterday from a friend. That's a hint, of
    sorts. (It is now Sunday afternoon where I am.)

    So, by definition the list of possible sequences starts with
    H ; 1/2 of all
    THH ; 1/8
    TTHHH ; 1/32
    THTHH ; 1/32 Sum up to here is 22/32
    TTTHHHH ; 1/128
    TTHTHHH
    TTHHTHH
    THTTHHH
    THTHTHH
    etc

    Here's a wild-assed guess: sqrt(0.5) = 0.707

    That's an interesting idea for how to analyze it. I'm not sure it
    works. One thing I can say for sure is when I tried to replicate it
    in a program I got wrong answers, or maybe it converges very slowly.
    An easy way to get a result that matches the theoretical value is
    just to simulate the coin flips using a random number generator. To
    save you the trouble of doing that the ultimate value is pi/4 (and
    it converges VERY slowly).

    So it is related to calculating pi by picking two random numbers and
    using them as coordinates in the [0..1 x 0..1] square.

    If that's true I don't see how or why it's true. I haven't tried
    to understand the derivation I was given earlier.

    I have not looked at the square with random numbers thing, so I can't
    comment on any similarities.

    As for the coin tossing, I would say it is just coincidence that there
    is a pi in the end result. When you are dealing with combinations of
    increasing numbers of things, you see factorials. When you are dealing
    with probabilities, you see converging sums. When you have converging
    sums whose elements have numerators of the form a·b^n and
    denominators with n!, you have something that looks like a Taylor
    series. And sometimes these can be pushed and shoved into matching the
    Taylor series for a common transcendental function. The probability
    questions you hear about are the ones that then give you a sum that
    involves popular numbers like pi or e.


    pi/4 =~ 0.78539816, so a bit larger than my wild-assed guess. :-)

    I thought your guess was pretty reasonable. I didn't have an
    opportunity to make a guess because I knew the answer before
    I understood the method.

    For this particular problem, convergence is /really/ slow - even if you
    calculate the probabilities rather than doing actual coin tosses. (In
    the Numberphile video, he did 10,000 coin tosses, and the result was
    about as far from pi/4 as sqrt(1/2) is.) So I agree that sqrt(1/2) is
    a reasonable guess, as you have to sum up a very large number of steps
    before you exceed that.
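    Calculating the probabilities directly makes the slow convergence
    visible. A sketch, using the fact (visible in Terje's enumeration) that
    the number of stopping sequences of length 2k+1 is the k-th Catalan
    number, so P(stop at 2k+1) = C_k / 2^(2k+1) and the fraction at that
    point is (k+1)/(2k+1):

```python
import math

# p_k = C_k / 2**(2k+1); recurrence p_{k+1} = p_k * (2k+1) / (2*(k+2)).
p = 0.5          # k = 0: sequence "H", probability 1/2, fraction 1/1
total = 0.0
for k in range(1_000_000):
    total += p * (k + 1) / (2 * k + 1)
    p *= (2 * k + 1) / (2 * (k + 2))

print(total, math.pi / 4)
# Even after a million terms the sum is still a few 1e-4 short of pi/4:
# the terms only decay like k**-1.5, so the tail shrinks like 1/sqrt(k).
```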


    Incidentally, the hint mentioned above is that I heard about it on
    pi day, March 14th. :)

    I did not grok that hint. :-(

    Definitely a very subtle hint. I didn't really expect anyone to
    get it, but I wanted to at least give an opportunity. And I've
    been surprised before by how smart some netizens are.

    --- Synchronet 3.21f-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Sun Apr 5 06:49:00 2026
    From Newsgroup: comp.arch

    MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

    Paul Clayton <paaronclayton@gmail.com> posted:

    On 11/5/25 3:43 PM, MitchAlsup wrote:
    [snip]
    I am now working on predictors for a 6-wide My 66000 machine--which is a bit
    different.
    a) VEC-LOOP loops do not alter the branch prediction tables.
    b) Predication clauses do not alter the BPTs.

    Not recording the history of predicates may have a negative
    effect on global history predictors. (I do not know if anyone
    has studied this, but it has been mentioned — e.g.,
    "[predication] has a negative side-effect because the removal
    of branches eliminates useful correlation information
    necessary for conventional branch predictors" from "Improving
    Branch Prediction and Predicated Execution in Out-of-Order
    Processors", Eduardo Quiñones et al., 2007.)

    It depends on where you are looking! If you think branch prediction
    alters where FETCH is Fetching, then MY 66000 predication does not
    do predication prediction--predication is used when the join point
    will have already been fetched by the time the condition is known.
    Then, either the then clause or the else clause will be nullified
    without backup (i.e., branch prediction repair).

    DECODE is still able to predict then-clause versus else-clause
    and maintain the no-backup property, as long as both sides are
    issued into the execution window.

    Predicate prediction can also be useful when the availability
    of the predicate is delayed. Similarly, selective eager
    execution might be worthwhile when the predicate is delayed;
    the selection is likely to be predictive (resource use might
    be a basis for selection but even estimating that might be
    predictive).

    The difference is that predication prediction never needs branch
    prediction repair.

    What happens to the instructions after the predicate?

    Let's say we have

    [...]
    peq0 r1,tf
    mov r2,#24
    mov r2,#48
    ldd r3,[r4,r2,0]
    [...]

    can the ldd be speculatively executed or not? And what happens if the prediction was wrong?
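    For readers following along, here is a hedged reading of what the
    sequence computes, assuming the "tf" shadow of peq0 marks the first mov
    as the then-clause (taken when r1 == 0) and the second as the
    else-clause, with memory modeled as a dict keyed by byte address:

```python
def predicated_load(r1, mem, r4):
    """Sketch of: peq0 r1,tf / mov r2,#24 / mov r2,#48 / ldd r3,[r4,r2,0]."""
    r2 = 24 if r1 == 0 else 48    # exactly one of the two movs takes effect
    r3 = mem[r4 + r2]             # the ldd depends on the selected r2
    return r3

# The speculation question: may the ldd issue before r1 is known?  A wrong
# guess means the machine has loaded from the wrong address.
mem = {1024 + 24: 3.5, 1024 + 48: 7.25}
print(predicated_load(0, mem, 1024), predicated_load(1, mem, 1024))  # 3.5 7.25
```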
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.21f-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Sun Apr 5 20:35:46 2026
    From Newsgroup: comp.arch


    Thomas Koenig <tkoenig@netcologne.de> posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

    Paul Clayton <paaronclayton@gmail.com> posted:

    On 11/5/25 3:43 PM, MitchAlsup wrote:
    [snip]
    I am now working on predictors for a 6-wide My 66000 machine--which is a bit
    different.
    a) VEC-LOOP loops do not alter the branch prediction tables.
    b) Predication clauses do not alter the BPTs.

    Not recording the history of predicates may have a negative
    effect on global history predictors. (I do not know if anyone
    has studied this, but it has been mentioned — e.g.,
    "[predication] has a negative side-effect because the removal
    of branches eliminates useful correlation information
    necessary for conventional branch predictors" from "Improving
    Branch Prediction and Predicated Execution in Out-of-Order
    Processors", Eduardo Quiñones et al., 2007.)

    It depends on where you are looking! If you think branch prediction
    alters where FETCH is Fetching, then MY 66000 predication does not
    do predication prediction--predication is used when the join point
    will have already been fetched by the time the condition is known.
    Then, either the then clause or the else clause will be nullified
    without backup (i.e., branch prediction repair).

    DECODE is still able to predict then-clause versus else-clause
    and maintain the no-backup property, as long as both sides are
    issued into the execution window.

    Predicate prediction can also be useful when the availability
    of the predicate is delayed. Similarly, selective eager
    execution might be worthwhile when the predicate is delayed;
    the selection is likely to be predictive (resource use might
    be a basis for selection but even estimating that might be
    predictive).

    The difference is that predication prediction never needs branch
    prediction repair.

    What happens to the instructions after the predicate?

    Let's say we have

    [...]
    peq0 r1,tf
    mov r2,#24
    mov r2,#48
    ldd r3,[r4,r2,0]
    [...]

    can the ldd be speculatively executed or not? like::

    peq0 r1,tf
    ldd r3,[r4,#24]
    ldd r3,[r4,#48]

    And what happens if the prediction was wrong?

    Some form of backup and do it right, along with some form of
    not updating the cache on the one which was not supposed to
    be executed {or TLB or L2}.


    --- Synchronet 3.21f-Linux NewsLink 1.2
  • From Thomas Koenig@tkoenig@netcologne.de to comp.arch on Mon Apr 6 05:11:21 2026
    From Newsgroup: comp.arch

    MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

    Thomas Koenig <tkoenig@netcologne.de> posted:

    Let's say we have

    [...]
    peq0 r1,tf
    mov r2,#24
    mov r2,#48
    ldd r3,[r4,r2,0]
    [...]

    can the ldd be speculatively executed or not? like::

    peq0 r1,tf
    ldd r3,[r4,#24]
    ldd r3,[r4,#48]

    Yes, that can be simplified.

    Could the load in the original be speculatively executed?


    And what happens if the prediction was wrong?

    Some form of backup and do it right, along with some form of
    not updating the cache on the one which was not supposed to
    be executed {or TLB or L2}.

    Does your answer apply to my original code or to the one
    that you posted? If it only applies to the latter, I can
    easily make up an example that cannot be simplified the
    way you did (let's take that as a given).
    --
    This USENET posting was made without artificial intelligence,
    artificial impertinence, artificial arrogance, artificial stupidity,
    artificial flavorings or artificial colorants.
    --- Synchronet 3.21f-Linux NewsLink 1.2
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Mon Apr 6 16:24:36 2026
    From Newsgroup: comp.arch


    Thomas Koenig <tkoenig@netcologne.de> posted:

    MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:

    Thomas Koenig <tkoenig@netcologne.de> posted:

    Let's say we have

    [...]
    peq0 r1,tf
    mov r2,#24
    mov r2,#48
    ldd r3,[r4,r2,0]
    [...]

    can the ldd be speculatively executed or not? like::

    peq0 r1,tf
    ldd r3,[r4,#24]
    ldd r3,[r4,#48]

    Yes, that can be simplified.

    Could the load in the original be speculatively executed?


    And what happens if the prediction was wrong?

    Some form of backup and do it right, along with some form of
    not updating the cache on the one which was not supposed to
    be executed {or TLB or L2}.

    Does your answer apply to my original code or to the one
    that you posted?

    Both.

    If it only applies to the latter, I can
    easily make up an example that cannot be simplified the
    way you did (let's take that as a given).

    --- Synchronet 3.21f-Linux NewsLink 1.2
  • From Robert Finch@robfi680@gmail.com to comp.arch on Tue Apr 7 22:53:59 2026
    From Newsgroup: comp.arch

    Dedicated a general-purpose register (out of 128 GPRs) to store the
    round mode for different data types. Each data type has a nibble to hold
    the round mode.

    Bits
    0 to 3 FLT - floating point
    4 to 7 DFLT - decimal float
    8 to 11 POS - posit (reserved)
    12 to 15 FIX - fixed point
    16 to 19 INT - integer (arithmetic shift right, average)

    If the dynamic round mode is changed, the new round mode becomes
    visible through the same register renaming as the other GPRs. The
    round mode is then easily updated with a bitfield insert (DEP).
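    A sketch of the nibble layout and the DEP-style update (field offsets
    taken from the table above; dep is modeled here as a plain
    mask-and-merge bitfield insert, and the mode encodings are
    hypothetical):

```python
# Nibble offsets per the table above.
RM_FLT, RM_DFLT, RM_POS, RM_FIX, RM_INT = 0, 4, 8, 12, 16

def dep(reg, offset, mode):
    """Bitfield insert: drop a 4-bit round mode into its nibble."""
    mask = 0xF << offset
    return (reg & ~mask) | ((mode & 0xF) << offset)

def rm(reg, offset):
    """Extract the round mode nibble for one data type."""
    return (reg >> offset) & 0xF

r = 0
r = dep(r, RM_FLT, 0x1)     # e.g. a mode for binary floating point
r = dep(r, RM_INT, 0x3)     # e.g. a mode for arithmetic shift right
print(hex(r), rm(r, RM_FLT), rm(r, RM_INT))   # 0x30001 1 3
```

    Each update touches only its own nibble, so the other data types'
    round modes survive the insert.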

    It does mean the round mode occupies an operand slot in the RS.

    I suppose there could be a separate rounding mode for each precision too.

    The round mode is separate from the FP status reg., which is not a GPR.
    The FP status reg. is stored in the ROB and eventually makes it back to
    the architectural FP status reg. It is not readable without using a FP
    FENCE instruction first.

    Thinking about merging status registers for different data types into
    the same status register.



    --- Synchronet 3.21f-Linux NewsLink 1.2