In article <10q9h8c$v7qa$1@dont-email.me>, robfi680@gmail.com (Robert
Finch) wrote:
Organize the physical registers into sets. Have only eight physical
registers associated with each pair of ISA registers. That is an
average of four rename registers available for each ISA register.
They are organized in pairs to try and increase the odds of a
rename register being available.
This seems to present the compiler writer with a temptation to make use
of information about the number of rename registers in long expression sequences. That causes problems when the implementation changes.
John
Okay, I have come up with the following scheme for mapping registers in
the RAT to reduce the storage and logic requirements. Probably at the
cost of some performance.
Organize the physical registers into sets. Have only eight physical registers associated with each pair of ISA registers. That is an average
of four rename registers available for each ISA register. They are
organized in pairs to try and increase the odds of a rename register
being available.
For the register map, store a three-bit index into the group of eight physical registers for each ISA register. When referenced the physical register number is the ISA register number divided by two concatenated
with the three-bit index.
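A minimal sketch of the lookup described above, assuming 128 ISA registers and 512 physical registers (the function and constant names are illustrative, not taken from Qupls):

```python
GROUP_BITS = 3  # three-bit index stored in the RAT for each ISA register


def physical_register(isa_reg: int, rat_index: int) -> int:
    """Concatenate the pair number (isa_reg // 2) with the 3-bit RAT index."""
    if not 0 <= rat_index < (1 << GROUP_BITS):
        raise ValueError("RAT index must fit in three bits")
    return ((isa_reg >> 1) << GROUP_BITS) | rat_index


# ISA registers 4 and 5 share group 2, i.e. physical registers 16..23.
print(physical_register(4, 0), physical_register(5, 7))
```

Both registers of a pair draw from the same eight physical registers, which is why one heavily renamed register can borrow capacity from its idle partner.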
The difference between this and a flat register map is that only three
bits are required to identify the physical register. So it requires 1/3
the storage space and 1/3 the muxing. (with 512 physical registers).
For Qupls RAT it works out to 17,300 LUTs instead of 27,600 LUTs.
However, the number of stalls in renaming is potentially increased… IDK how big the impact is.
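The 1/3 figure above follows from the width of a RAT entry alone. A small check, assuming the entry holds nothing but the register selector (the helper name is hypothetical):

```python
def rat_entry_bits(choices: int) -> int:
    """Bits needed to select one of `choices` physical registers."""
    return (choices - 1).bit_length()


flat_bits = rat_entry_bits(512)   # flat map: any of 512 physical registers
grouped_bits = rat_entry_bits(8)  # grouped map: one of eight in a fixed group
print(flat_bits, grouped_bits)    # 9 bits versus 3 bits per entry
```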
In the machines on which I participated, we kept additional information
in <essentially> the RAT--in particular, the identity of the FU which
will deliver the result. So, reading the RAT gave the Physical Register
Number (PRN), the FU who delivers a result and whether the result is in
RF or waiting on FU. We encoded this such that FU and PRN used the same
bits along with a state, so the whole thing used only 8-bits; 1 for PRN
(7-bits: 128 Physical Registers) versus FU (3-bits) and tag (4-bits).
When the operand is latent, FU tells the Reservation Station entry which
result bus to monitor and which tag to match. This means the RS entries
are only watching 1 bus each, need only 1 comparator; but over time each
entry can monitor all result busses.
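One way to picture the 8-bit entry described above is sketched here. The post gives only the field widths; which value of the state bit means "waiting" and the order of the FU and tag fields are assumptions made for illustration.

```python
STATE_WAITING = 1 << 7  # assumed: top bit set means the result is still pending


def encode_in_rf(prn: int) -> int:
    """State 0: the remaining 7 bits name one of 128 physical registers."""
    if not 0 <= prn < 128:
        raise ValueError("PRN must fit in 7 bits")
    return prn


def encode_waiting(fu: int, tag: int) -> int:
    """State 1: the same 7 bits hold a 3-bit FU number and a 4-bit tag."""
    if not (0 <= fu < 8 and 0 <= tag < 16):
        raise ValueError("FU must fit in 3 bits and tag in 4 bits")
    return STATE_WAITING | (fu << 4) | tag


def decode(entry: int):
    if entry & STATE_WAITING:
        return ("waiting", (entry >> 4) & 0x7, entry & 0xF)
    return ("in_rf", entry & 0x7F)
```

A reservation station entry that reads a "waiting" entry learns both which result bus to watch (the FU number) and which tag to compare against.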
MitchAlsup <user5857@newsgrouper.org.invalid> writes:
In the machines on which I participated, we kept additional information
in <essentially> the RAT--in particular, the identity of the FU which
will deliver the result. So, reading the RAT gave the Physical Register
Number (PRN), the FU who delivers a result and whether the result is in
RF or waiting on FU. We encoded this such that FU and PRN used the same
bits along with a state, so the whole thing used only 8-bits; 1 for PRN
(7-bits: 128 Physical Registers) versus FU (3-bits) and tag (4-bits).
My impression is that in recent CPUs with valueless reservation
stations the PRN is used in the in-flight instructions and the RAT
from the start, without needing to change anything in in-flight
instructions once an instruction that writes a register it depends on delivers its results. Maybe they have additional bits for detecting
that the result is available without having to compare everything.
One disadvantage of this approach is that PRNs are allocated before
they actually need to store something. For programs that have a lot
of instructions waiting for some result (of, e.g., a cache miss) to
become ready, that might be an issue. OTOH, you need those physical registers anyway for programs that have a lot of finished instructions waiting for committing in the reorder buffer (e.g., due to having to
wait for an instruction that might trap or mispredict).
So finding a way to reduce the register needs of the former kind of
program may not lead to actually reducing the number of registers;
therefore such a way may not actually be useful.
When the operand is latent, FU tells the Reservation Station entry which
result bus to monitor and which tag to match. This means the RS entries
are only watching 1 bus each, need only 1 comparator; but over time each
entry can monitor all result busses.
Your description inspired an idea: My impression is that having many
write ports is much more expensive than having more registers. So
have a register file for each FU, with one write port for each
FU-specific register file. The total number of physical registers
would have to be increased to achieve a similar renaming capacity
across typical workloads, and one probably still needs a similar
number of read ports as before, but the result might still require
less area.
However, based on register file capacity measurements (e.g., at
chipsandcheese) it seems that modern microarchitectures differentiate
at most between GPRs, SIMD registers, and various flags when it comes
to physical registers. So either the area saving from reduced write
ports is not that relevant and many-write-ported register files are
used.
Or there is a backup mechanism for making use of other register files
when the FU's register file has no free registers; e.g., if the
register renamer finds that all registers in the FU's register file
are allocated, it can insert a move uop from a register of the target
FU to an idle FU with enough free registers before
allocating a register to an uop for the target FU.
- anton
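Anton's fallback idea above (one register file per FU, with a move uop inserted when the target file is full) might be sketched as follows. This is a toy model under stated assumptions: the class, the victim choice, and the uop tuple format are all hypothetical, and a real renamer would also have to repoint the RAT at the moved value and free the register only once the move has executed.

```python
from dataclasses import dataclass, field


@dataclass
class RegFile:
    name: str
    size: int
    live: set = field(default_factory=set)  # currently allocated registers

    def free_count(self) -> int:
        return self.size - len(self.live)

    def alloc(self):
        """Return a free register number, or None if the file is full."""
        for r in range(self.size):
            if r not in self.live:
                self.live.add(r)
                return r
        return None


def rename_dest(target: RegFile, others: list):
    """Allocate a destination in `target`; returns (reg, extra_uops)."""
    reg = target.alloc()
    if reg is not None:
        return reg, []
    donor = max(others, key=RegFile.free_count) if others else None
    if donor is None or donor.free_count() == 0:
        return None, []                      # nothing to spill to: stall
    victim = min(target.live)                # assumed victim policy
    new_home = donor.alloc()
    target.live.remove(victim)               # simplification: freed at once
    move = ("move", target.name, victim, donor.name, new_home)
    return target.alloc(), [move]
```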
This seems to present the compiler writer with a temptation to
make use of information about the number of rename registers in
long expression sequences. That causes problems when the
implementation changes.
Yes. I think it would only cause performance differences.
Performance should only improve on a better implementation.
I have thought that superscalars were complex enough that people
would not be cycle counting, but measuring instead.
anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
MitchAlsup <user5857@newsgrouper.org.invalid> writes:
In the machines on which I participated, we kept additional information
in <essentially> the RAT--in particular, the identity of the FU which
will deliver the result. So, reading the RAT gave the Physical Register
Number (PRN), the FU who delivers a result and whether the result is in
RF or waiting on FU. We encoded this such that FU and PRN used the same
bits along with a state, so the whole thing used only 8-bits; 1 for PRN
(7-bits: 128 Physical Registers) versus FU (3-bits) and tag (4-bits).
My impression is that in recent CPUs with valueless reservation
stations the PRN is used in the in-flight instructions and the RAT
from the start, without needing to change anything in in-flight
instructions once an instruction that writes a register it depends on
delivers its results. Maybe they have additional bits for detecting
that the result is available without having to compare everything.
There are several ways to do the above {ARN number}, {PRN}, {FU,tag}, ...
all arriving at the same spot--Data-flow works.
One disadvantage of this approach is that PRNs are allocated before
they actually need to store something. For programs that have a lot
of instructions waiting for some result (of, e.g., a cache miss) to
become ready, that might be an issue. OTOH, you need those physical
registers anyway for programs that have a lot of finished instructions
waiting for committing in the reorder buffer (e.g., due to having to
wait for an instruction that might trap or mispredict).
PRNs are required from about 2 cycles after Decode until RoB retirement. Certain PRNs may die earlier (over written) and different choices make
this easier or harder.
So finding a way to reduce the register needs of the former kind of
program may not lead to actually reducing the number of registers;
therefore such a way may not actually be useful.
Not by enough to count.
When the operand is latent, FU tells the Reservation Station entry which
result bus to monitor and which tag to match. This means the RS entries
are only watching 1 bus each, need only 1 comparator; but over time each
entry can monitor all result busses.
Your description inspired an idea: My impression is that having many
write ports is much more expensive than having more registers. So
have a register file for each FU, with one write port for each
FU-specific register file. The total number of physical registers
would have to be increased to achieve a similar renaming capacity
across typical workloads, and one probably still needs a similar
number of read ports as before, but the result might still require
less area.
Mc 88120 had 6 "FU"s, 6 write ports. Each FU contained an Integer unit
(less shift) and whatever the FU was named for. I found this "better"
than having 12 FUs {6 Int, 3 Mem, 1 FMUL, 1 FADD, 1 Branch} because of
bus contention between Ints and Others. Not much but enough.
However, based on register file capacity measurements (e.g., at
chipsandcheese) it seems that modern microarchitectures differentiate
at most between GPRs, SIMD registers, and various flags when it comes
to physical registers. So either the area saving from reduced write
ports is not that relevant and many-write-ported register files are
used.
Or there is a backup mechanism for making use of other register files
when the FU's register file has no free registers; e.g., if the
register renamer finds that all registers in the FU's register file
are allocated, it can insert a move uop from a register of the target
FU to an idle FU with enough free registers before
allocating a register to an uop for the target FU.
- anton
On 2026-03-31 2:35 p.m., MitchAlsup wrote:
anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
MitchAlsup <user5857@newsgrouper.org.invalid> writes:
In the machines on which I participated, we kept additional information
in <essentially> the RAT--in particular, the identity of the FU which
will deliver the result. So, reading the RAT gave the Physical Register
Number (PRN), the FU who delivers a result and whether the result is in
RF or waiting on FU. We encoded this such that FU and PRN used the same
bits along with a state, so the whole thing used only 8-bits; 1 for PRN
(7-bits: 128 Physical Registers) versus FU (3-bits) and tag (4-bits).
My impression is that in recent CPUs with valueless reservation
stations the PRN is used in the in-flight instructions and the RAT
from the start, without needing to change anything in in-flight
instructions once an instruction that writes a register it depends on
delivers its results. Maybe they have additional bits for detecting
that the result is available without having to compare everything.
There are several ways to do the above {ARN number}, {PRN}, {FU,tag}, ... all arriving at the same spot--Data-flow works.
One disadvantage of this approach is that PRNs are allocated before
they actually need to store something. For programs that have a lot
of instructions waiting for some result (of, e.g., a cache miss) to
become ready, that might be an issue. OTOH, you need those physical
registers anyway for programs that have a lot of finished instructions
waiting for committing in the reorder buffer (e.g., due to having to
wait for an instruction that might trap or mispredict).
PRNs are required from about 2 cycles after Decode until RoB retirement. Certain PRNs may die earlier (over written) and different choices make
this easier or harder.
So finding a way to reduce the register needs of the former kind of
program may not lead to actually reducing the number of registers;
therefore such a way may not actually be useful.
Not by enough to count.
When the operand is latent, FU tells the Reservation Station entry which
result bus to monitor and which tag to match. This means the RS entries
are only watching 1 bus each, need only 1 comparator; but over time each
entry can monitor all result busses.
Your description inspired an idea: My impression is that having many
write ports is much more expensive than having more registers. So
have a register file for each FU, with one write port for each
FU-specific register file. The total number of physical registers
would have to be increased to achieve a similar renaming capacity
across typical workloads, and one probably still needs a similar
number of read ports as before, but the result might still require
less area.
Mc 88120 had 6 "FU"s, 6 write ports. Each FU contained an Integer unit (less shift) and whatever the FU was named for. I found this "better"
than having 12 FUs {6 Int, 3 Mem, 1 FMUL, 1 FADD, 1 Branch} because of
bus contention between Ints and Others. Not much but enough.
However, based on register file capacity measurements (e.g., at
chipsandcheese) it seems that modern microarchitectures differentiate
at most between GPRs, SIMD registers, and various flags when it comes
to physical registers. So either the area saving from reduced write
ports is not that relevant and many-write-ported register files are
used.
Or there is a backup mechanism for making use of other register files
when the FU's register file has no free registers; e.g., if the
register renamer finds that all registers in the FU's register file
are allocated, it can insert a move uop from a register of the target
FU to an idle FU with enough free registers before
allocating a register to an uop for the target FU.
- anton
These posts have given me something to investigate: whether it is smaller
to add to the RAT and reduce the number of comparators in the
reservation stations, or to reduce the RAT.
More config options coming up.
Let's see if I understand this. While there may only be one bus being
monitored, that bus has to originate from the other result busses via a
mux. So, the result busses are going past the reservation stations which then feed into a mux controlled by the FU id which the reservation
station examines for values. I think I can see where that would make the reservation stations smaller. It gets rid of the comparators in the reservation stations and replaces them with muxes on the result busses.
Qupls has a slightly different organization. There are a lot of
functional units. 14 IIRC for a full-blown version, each with four or
more read ports. But there are only four result busses being examined.
The result bus is dynamically selected to update the register file.
Whichever set of four results is selected is looked at.
Qupls has values stored in the reservation stations. There are only 16 register read ports running to the reservation stations that are used to load values. Then the four result busses also monitored for values to
load. All of this is still smaller than the RAT, as Qupls is configured
at the moment.
I could try changing things so that all 14 (or more) result busses run
past the reservation stations, but I have a feeling that all the muxes
for the busses will consume a lot of logic. Muxes are relatively
expensive in an FPGA. Comparators are less expensive I think.
Current config (8 units):
ALU1, ALU2, IMUL, DIV, FMA, FPU, MEM, BRANCH
Reservation stations are using about 5k LUTs each.
The RAT is about 50k LUTs.
Robert Finch <robfi680@gmail.com> posted:
On 2026-03-31 2:35 p.m., MitchAlsup wrote:
anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:
MitchAlsup <user5857@newsgrouper.org.invalid> writes:
In the machines on which I participated, we kept additional information
in <essentially> the RAT--in particular, the identity of the FU which
will deliver the result. So, reading the RAT gave the Physical Register
Number (PRN), the FU who delivers a result and whether the result is in
RF or waiting on FU. We encoded this such that FU and PRN used the same
bits along with a state, so the whole thing used only 8-bits; 1 for PRN
(7-bits: 128 Physical Registers) versus FU (3-bits) and tag (4-bits).
My impression is that in recent CPUs with valueless reservation
stations the PRN is used in the in-flight instructions and the RAT
from the start, without needing to change anything in in-flight
instructions once an instruction that writes a register it depends on
delivers its results. Maybe they have additional bits for detecting
that the result is available without having to compare everything.
There are several ways to do the above {ARN number}, {PRN}, {FU,tag}, ...
all arriving at the same spot--Data-flow works.
One disadvantage of this approach is that PRNs are allocated before
they actually need to store something. For programs that have a lot
of instructions waiting for some result (of, e.g., a cache miss) to
become ready, that might be an issue. OTOH, you need those physical
registers anyway for programs that have a lot of finished instructions
waiting for committing in the reorder buffer (e.g., due to having to
wait for an instruction that might trap or mispredict).
PRNs are required from about 2 cycles after Decode until RoB retirement.
Certain PRNs may die earlier (over written) and different choices make
this easier or harder.
So finding a way to reduce the register needs of the former kind of
program may not lead to actually reducing the number of registers;
therefore such a way may not actually be useful.
Not by enough to count.
When the operand is latent, FU tells the Reservation Station entry which
result bus to monitor and which tag to match. This means the RS entries
are only watching 1 bus each, need only 1 comparator; but over time each
entry can monitor all result busses.
Your description inspired an idea: My impression is that having many
write ports is much more expensive than having more registers. So
have a register file for each FU, with one write port for each
FU-specific register file. The total number of physical registers
would have to be increased to achieve a similar renaming capacity
across typical workloads, and one probably still needs a similar
number of read ports as before, but the result might still require
less area.
Mc 88120 had 6 "FU"s, 6 write ports. Each FU contained an Integer unit
(less shift) and whatever the FU was named for. I found this "better"
than having 12 FUs {6 Int, 3 Mem, 1 FMUL, 1 FADD, 1 Branch} because of
bus contention between Ints and Others. Not much but enough.
However, based on register file capacity measurements (e.g., at
chipsandcheese) it seems that modern microarchitectures differentiate
at most between GPRs, SIMD registers, and various flags when it comes
to physical registers. So either the area saving from reduced write
ports is not that relevant and many-write-ported register files are
used.
Or there is a backup mechanism for making use of other register files
when the FU's register file has no free registers; e.g., if the
register renamer finds that all registers in the FU's register file
are allocated, it can insert a move uop from a register of the target
FU to an idle FU with enough free registers before
allocating a register to an uop for the target FU.
- anton
These posts have given me something to investigate: whether it is smaller
to add to the RAT and reduce the number of comparators in the
reservation stations, or to reduce the RAT.
More config options coming up.
Let's see if I understand this. While there may only be one bus being
monitored, that bus has to originate from the other result busses via a
mux. So, the result busses are going past the reservation stations which
then feed into a mux controlled by the FU id which the reservation
station examines for values. I think I can see where that would make the
reservation stations smaller. It gets rid of the comparators in the
reservation stations and replaces them with muxes on the result busses.
Right, all result busses go to all RSs. Each RS entry.operand watches
1 (or 0) busses. Any RS entry.operand can watch any result bus.
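The arrangement Mitch confirms here (every result bus reaches every RS, but each operand selects one bus by FU number and compares one tag) can be sketched like this. The class and field names are hypothetical, and a bus is modeled as either `None` or a `(tag, value)` pair.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operand:
    fu: Optional[int]            # result bus to watch; None once captured
    tag: int = 0
    value: Optional[int] = None

    def snoop(self, result_busses) -> None:
        """result_busses[i] is None or (tag, value) driven by FU i this cycle."""
        if self.fu is None:
            return                            # watching 0 busses
        driven = result_busses[self.fu]       # the mux, steered by the FU id
        if driven is not None and driven[0] == self.tag:  # the one comparator
            self.value = driven[1]
            self.fu = None


op = Operand(fu=2, tag=5)
op.snoop([None, (5, 111), (4, 222)])  # tag 5 is on bus 1, which is not watched
op.snoop([None, None, (5, 333)])      # bus 2 now carries tag 5: captured
```

The trade Robert describes is visible here: the per-bus comparators disappear, and in their place each operand needs a mux across all result busses.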
Qupls has a slightly different organization. There are a lot of
functional units. 14 IIRC for a full-blown version, each with four or
more read ports. But there are only four result busses being examined.
The result bus is dynamically selected to update the register file.
I would consider the dynamically selected result bus a mistake. A
result bus is heavily loaded and needs big drivers. Your design will
need 4 big drivers per FU instead of 1. And for what gain ??
Whichever set of four results is selected is looked at.
Qupls has values stored in the reservation stations. There are only 16
register read ports running to the reservation stations that are used to
load values. Then the four result busses also monitored for values to
load. All of this is still smaller than the RAT, as Qupls is configured
at the moment.
How many entries (instructions) per RS ?
I could try changing things so that all 14 (or more) result busses run
past the reservation stations, but I have a feeling that all the muxes
for the busses will consume a lot of logic. Muxes are relatively
expensive in an FPGA. Comparators are less expensive I think.
Current config (8 units):
ALU1, ALU2, IMUL, DIV, FMA, FPU, MEM, BRANCH
versus:
ALU1, ALU2, ALU3, ALU4, ALU5, ALU6
MEM1, MEM2, MEM3, FADD, FMUL, Branch
SFT1, SFT2, SFT3, FMSC, FDIV,
where vertical means they are the same FU#
Reservation stations are using about 5k LUTs each.
14×5 = 70K
The RAT is about 50k LUTs.
On 2026-04-02 1:57 p.m., MitchAlsup wrote:
I would consider the dynamically selected result bus a mistake. A
result bus is heavily loaded and needs big drivers. Your design will
need 4 big drivers per FU instead of 1. And for what gain ??
An issue is the number of result busses to support all the units.
There is something like 16 or 18 results (some units can produce two results), I thought it would not work to have a result bus for every
unit. 16 write ports on the register file was not happening. I could not
see how to reduce things to say 6 busses.
Four busses were used to minimize the size of the register file, since
there was a mux anyway. I was not thinking of the driver electronics for running in an FPGA.
I am not fond of the dynamic selected result bus, either. Maybe it could
be reduced to eight busses, without dynamic selection.
Whichever set of four results is selected is looked at.
Qupls has values stored in the reservation stations. There are only 16
register read ports running to the reservation stations that are used to
load values. Then the four result busses also monitored for values to
load. All of this is still smaller than the RAT, as Qupls is configured
at the moment.
How many entries (instructions) per RS ?
Qupls is currently configured for one entry per RS. But it is a
parameter (for each RS). It had to be minimized to fit the FPGA.
I think the 5k size was for three-entry RS.
I could try changing things so that all 14 (or more) result busses run
past the reservation stations, but I have a feeling that all the muxes
for the busses will consume a lot of logic. Muxes are relatively
expensive in an FPGA. Comparators are less expensive I think.
Current config (8 units):
ALU1, ALU2, IMUL, DIV, FMA, FPU, MEM, BRANCH
versus:
ALU1, ALU2, ALU3, ALU4, ALU5, ALU6
MEM1, MEM2, MEM3, FADD, FMUL, Branch
SFT1, SFT2, SFT3, FMSC, FDIV,
where vertical means they are the same FU#
Okay, I had units separated by latency so there is minimal latency going from the unit back to the results/input (feedback paths). Trying to keep performance of dependent instructions good.
ALU1, ALU2 are single cycle latency. FPU is three cycles versus FMA
which is five. Most of the units can issue an instruction every clock
cycle. Some units not in the minimal config may have large latencies and cannot issue every cycle. These include float trig, graphics unit,
neural net unit.
Although two ALUs are shown, the FPU can execute ALU instructions too.
And the ALU can execute the single cycle FPU instructions. I use the
name SAU (for simple arithmetic unit) because of the crossover. When I
see ALU I think integer.
There are four result busses to feed the register file. A larger
register file may be too much for the current implementation. There is a
lot of BRAM used for the register file. 1/4 BRAMs in the device.
Reservation stations are using about 5k LUTs each.
14×5 = 70K
The RAT is about 50k LUTs.
I tried configuring Qupls for 3 entries per RS, and more functional units/functionality, but it turned out to be about 700,000 LUTs.
I am trying to keep a demo under 200k LUTs.
When I obtain a larger board it will just be a matter of changing some config values.
Robert Finch <robfi680@gmail.com> posted:
On 2026-04-02 1:57 p.m., MitchAlsup wrote:
I would consider the dynamically selected result bus a mistake. A
result bus is heavily loaded and needs big drivers. Your design will
need 4 big drivers per FU instead of 1. And for what gain ??
An issue is the number of result busses to support all the units.
There is something like 16 or 18 results (some units can produce two
results), I thought it would not work to have a result bus for every
unit. 16 write ports on the register file was not happening. I could not
see how to reduce things to say 6 busses.
Realistically, you are going to be performing between 2 and 3 I/c
and thus 4-6 busses are perfectly capable.
Four busses were used to minimize the size of the register file, since
there was a mux anyway. I was not thinking of the driver electronics for
running in an FPGA.
I am not fond of the dynamic selected result bus, either. Maybe it could
be reduced to eight busses, without dynamic selection.
Whichever set of four results is selected is looked at.
Qupls has values stored in the reservation stations. There are only 16
register read ports running to the reservation stations that are used to
load values. Then the four result busses also monitored for values to
load. All of this is still smaller than the RAT, as Qupls is configured
at the moment.
How many entries (instructions) per RS ?
Qupls is currently configured for one entry per RS. But it is a
parameter (for each RS). It had to be minimized to fit the FPGA.
I think the 5k size was for three-entry RS.
Ok, I mean that a RS has both width and depth. Width would be chosen
to be appropriate for the number of operands any of the attached FUs
would need (max) So an INT unit would have 2-operands, a Mem unit
would have 2 register operands and one constant operand (Displacement), FMUL/FMAC would have 3, ...
A RS has depth, so with a ~100 Instruction execution window, and 6 FUs
one would expect 16 RS.instructions each with 2 or 3 dynamic operands.
There is no reason to build RSs if you don't have enough/FU to cover the dynamic latency of the critical path.
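The depth estimate above reads as a simple division of the execution window across the FUs. A back-of-the-envelope sketch, assuming an even split (the function names are illustrative):

```python
def rs_depth(window: int, fus: int) -> int:
    """Entries per reservation station if the window is split evenly."""
    return window // fus


def rs_operand_slots(window: int, fus: int, operands_per_entry: int) -> int:
    """Total operand slots that must be able to watch the result busses."""
    return fus * rs_depth(window, fus) * operands_per_entry


print(rs_depth(100, 6))             # 16 entries per RS
print(rs_operand_slots(100, 6, 3))  # upper bound with 3 operands per entry
```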
I could try changing things so that all 14 (or more) result busses run
past the reservation stations, but I have a feeling that all the muxes
for the busses will consume a lot of logic. Muxes are relatively
expensive in an FPGA. Comparators are less expensive I think.
Current config (8 units):
ALU1, ALU2, IMUL, DIV, FMA, FPU, MEM, BRANCH
versus:
ALU1, ALU2, ALU3, ALU4, ALU5, ALU6
MEM1, MEM2, MEM3, FADD, FMUL, Branch
SFT1, SFT2, SFT3, FMSC, FDIV,
where vertical means they are the same FU#
Okay, I had units separated by latency so there is minimal latency going
from the unit back to the results/input (feedback paths). Trying to keep
performance of dependent instructions good.
ALU1, ALU2 are single cycle latency. FPU is three cycles versus FMA
which is five. Most of the units can issue an instruction every clock
cycle. Some units not in the minimal config may have large latencies and
cannot issue every cycle. These include float trig, graphics unit,
neural net unit.
{You will probably have to edit this to see the true ASCII art due to the inherent stupidity of the space eaters.}
One Function Unit::
     +------------------------------------------+
     |  +------------------------+              |
     |->|                        |   |\         |
     |->|    long latency FU     |-->| |        |
     |->|                        |   |M|        |
Rs-->|  +------------------------+   |U|  |\    |
     |                               |X|--|D|---|->result bus
     |  +--------+                   | |  |/    |
     |->| short  |------------------>| |        |
     |  +--------+                   |/         |
     +------------------------------------------+
You may even be able to use the <unused> buffering in the long latency sub-unit to delay the <already done> short latency calculation. Alternately, you could add some buffering between short and long to take up the slack.
The final gate inside the FU is the large heavily loaded bus driver.
There will be some kind of internal timing chain in the FU that arbitrates the long versus the short(s) and sends tags at the appropriate instant.
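The timing chain is not specified further in the post. One possible reading, offered only as a sketch, is a shift register of bus reservations: slot i names the tag that will drive the result bus i cycles from now, and a short-latency result that finds its slot taken is buffered until the next free one. All names here are hypothetical.

```python
class ResultBusTimingChain:
    def __init__(self, depth: int):
        # slots[i]: tag scheduled to drive the result bus i cycles from now
        self.slots = [None] * depth

    def reserve(self, tag, latency: int):
        """Book the first free slot at or after `latency`; None means stall."""
        for i in range(latency, len(self.slots)):
            if self.slots[i] is None:
                self.slots[i] = tag
                return i
        return None

    def tick(self):
        """Advance one cycle and return the tag driving the bus, if any."""
        self.slots = self.slots[1:] + [None]
        return self.slots[0]


chain = ResultBusTimingChain(depth=8)
chain.reserve("long", 3)            # long-latency op books cycle +3
chain.tick()
chain.tick()
delay = chain.reserve("short", 1)   # cycle +1 is taken, so short gets +2
```

The extra cycle granted to the short operation corresponds to the buffering between the short and long sub-units that the post mentions.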
Although two ALUs are shown, the FPU can execute ALU instructions too.
And the ALU can execute the single cycle FPU instructions. I use the
name SAU (for simple arithmetic unit) because of the crossover. When I
see ALU I think integer.
When I said ALU above, I meant {ADD, SUB, CMP, FCMP, certain Conversions, certain bit twiddling, logic}; that is, most things that fit in 1 cycle with forwarding and result bus drive.
There are four result busses to feed the register file. A larger
register file may be too much for the current implementation. There is a
lot of BRAM used for the register file. 1/4 BRAMs in the device.
I have built (logic design, circuit design, SPICE tuning, layout) a
6R-6W register file of 128×64-bit entries. The SPICE tuning was most "illuminating" as to the limitations of multi-port SRAM-like storage.
I do not, at this instant in time, think wider than 6R-6W is practicable. {{Just as well since we are only performing ~2.x I/c with 300 instruction execution windows {and cache hierarchy hit rates and latencies}}}
Reservation stations are using about 5k LUTs each.
14×5 = 70K
The RAT is about 50k LUTs.
I tried configuring Qupls for 3 entries per RS, and more functional
units/functionality, but it turned out to be about 700,000 LUTs.
I am trying to keep a demo under 200k LUTs.
When I obtain a larger board it will just be a matter of changing some
config values.