Re: Best CPU platform(s) for FPGA synthesis



jjohnson@xxxxxxxxxx wrote:
That appears to be related to the number of processors inside one box.
If a single CPU is just hyperthreaded, the processor takes care of
instruction distribution unrelated to a variable like number_of_cpus,
right?

No. Hyperthreading only doubles the hardware virtually. The CPU
maintains the state and register set of two independent threads and
tries to keep all of its functional units busy. If one thread has to
wait for data from memory, instructions from the other thread can be
issued to the functional units. Likewise, if one thread spends its time
in the FPU, the other thread can use the remaining functional units.
If both threads execute the same type of instructions, a hyperthreaded
CPU rarely has an advantage.

On a hyperthreaded CPU the operating system sees two logical cores and
has to schedule its workload as if there were two physical cores to gain
any benefit. If your software has only one thread, hyperthreading, like
multiple cores, won't speed it up.

And if there are two single-core processors in a box, obviously
it will utilize "number_of_cpus=2" as expected. Does anyone know how
that works with dual-core CPUs? i.e, if I have two quad-core CPUs in
one box, will setting "number_of_cpus=7" make optimal use of 7 cores
while leaving me one to work in a shell or window?

I don't know how Quartus makes use of the available CPUs, but as seen
from software there is basically no difference between two single-core
processors and one dual-core processor.
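
To illustrate the point that software only sees a count of logical
processors, here is a minimal Python sketch. It is not Quartus-specific:
`jobs_to_use` is a hypothetical helper, and reserving one processor for
interactive work is just the assumption taken from the question above.
Whether a tool actually scales to that many workers is a separate matter.

```python
import os


def jobs_to_use(logical_cpus, reserve=1):
    """Return how many workers to start, leaving `reserve` CPUs free.

    `logical_cpus` is whatever the OS reports. Two single-core CPUs,
    one dual-core CPU, and one hyperthreaded single-core CPU all show
    up as the same number here. Always returns at least 1.
    """
    if not logical_cpus:  # os.cpu_count() may return None
        return 1
    return max(1, logical_cpus - reserve)


if __name__ == "__main__":
    # os.cpu_count() reports logical processors, not sockets or packages.
    n = os.cpu_count()
    print("logical CPUs seen by the OS:", n)
    print("workers, leaving one CPU free:", jobs_to_use(n))
```

On a box with two quad-core CPUs (no hyperthreading) the OS would report
8, and the helper would suggest 7.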

In 32-bit Windows, is that 3GB limit for everything running at one
time? i.e., is 4GB a waste on a Windows machine? Can it run multiple
2GB processes and go beyond 3 or 4GB? Or is 3GB an absolute O/S limit,
and 2GB an absolute process limit in Windows?

3 GB is a practical limit because the PCI bus and other memory-mapped
devices typically occupy several hundred megabytes of the 4 GB address
space, so that part of the address space can't be used to access RAM.
There are techniques to remap that memory to address regions beyond the
4 GB boundary, but they need a suitable chipset and proper operating
system support.
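
The arithmetic behind that limit can be sketched in a few lines of
Python. The 768 MB figure for the device region is only an illustrative
assumption (the real size depends on the board and the installed cards),
and `usable_ram_mb` is a hypothetical helper, not an OS API.

```python
def usable_ram_mb(installed_mb, device_region_mb, address_bits=32):
    """RAM reachable without remapping, in megabytes.

    The address space holds 2**address_bits bytes. Whatever the
    memory-mapped devices claim is no longer available for RAM, so
    usable RAM is capped at (address space - device region).
    """
    address_space_mb = 2 ** address_bits // (1024 * 1024)
    ram_window_mb = address_space_mb - device_region_mb
    return min(installed_mb, ram_window_mb)


if __name__ == "__main__":
    # Assumed example: 4 GB installed, 768 MB claimed by devices.
    print(usable_ram_mb(4096, 768))  # 4096 - 768 = 3328
    # With only 2 GB installed the device region costs nothing.
    print(usable_ram_mb(2048, 768))
```

With these assumed numbers a 4 GB machine ends up with a bit over 3 GB
of usable RAM, which matches the practical limit described above.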

Andreas