SSE2 and beyond emulation #4185

Open
2 tasks done
Torinde opened this issue Apr 17, 2023 · 9 comments

@Torinde
Contributor

Torinde commented Apr 17, 2023

Is your feature request related to a problem? Please describe.

Running Win9x/XP software that uses SSE2/3/SSSE3 instructions.

What you want

Add support for SSE2 and later instruction sets.

Describe alternatives you've considered

No response

Additional information

Additionally, now that Win2K/XP are becoming more and more feasible to run, maybe SSE2/3 can be added as there are many programs that use those, for example:

Initial issue before EDIT:
It seems the Experimental CPU type (for FISTTP) is based on Pentium II (MMX) instead of Pentium III (SSE)?
Please add SSE (and other Pentium III CPU type features) to the Experimental CPU type.

Have you checked that no similar feature request(s) exist?

  • I have searched and didn't find any similar feature request.

Code of Conduct & Contributing Guidelines

  • I agree to follow the code of conduct and the contributing guidelines.
@Torinde
Contributor Author

Torinde commented Apr 18, 2023

For the SSE2/3:

  • P6-based Pentium M ULV 733J 1.1GHz = PIII + SSE2, NX-bit
  • P6-based Core Solo U1300 1.07GHz = PIII, SSE2, NX-bit + SSE3, VT-x, Constant TSC

Alternatively, on the 3DNow! side:

  • Sempron 2600+ 1.6GHz Palermo S754 SDA2600AIO2BA = PIII + 3DNow!+, SSE2, NX-bit, SYSCALL/SYSRET
  • Sempron 2600+ 1.6GHz Palermo S754 SDA2600AIO2BO = PIII, 3DNow!+, SSE2, NX-bit, SYSCALL/SYSRET + SSE3

If you want to avoid NX-bit, then:

  • SSE2: Pentium 4 1.3GHz, Pentium M ULV 900MHz
  • SSE3: Pentium 4 2.4GHz 90nm, Celeron D 310 2.13GHz (both have Constant TSC)

I think most of those also have PAE/PSE, which allow more than 4GB RAM on Win2000/2003.

@joncampbell123
Owner

The design inherited from DOSBox SVN prevents DOSBox-X from ever emulating more than 4GB of RAM, and DOSBox-X does not intend to emulate more than 4GB of RAM in any case. I'm not sure the design inherited from SVN would allow the longer page table length and size required for PSE to work, and therefore I don't think NX emulation is going to happen. PAE might be possible though.

I do think 3DNow! would be nice to emulate since that is well within the time frame of the DOS to WinXP era.

SSE2 is likely as far as DOSBox-X is going to go in that instruction set family, since SSE3 and up are associated with much later systems, past the mid-2000s, that lack the ISA bus.

@Torinde
Contributor Author

Torinde commented Apr 25, 2023

I sidetracked my own topic by bringing in SSE2/3. :)

The main purpose of this issue was to point out that, because SSE/P3 was added to DOSBox-X after FISTTP/Experimental, the Experimental CPU type lacks SSE, and I think that should be corrected.

@Torinde
Contributor Author

Torinde commented Apr 25, 2023

Agree with doing 3DNow! first, then SSE2 (both are used in DOS/Win9x software).

Arguments FOR going beyond to:
SSE3

  • used by latest versions of some DOS/Win9x software (see OP here)
  • used by WinXP software (see OP here)
  • FISTTP instruction is already supported
  • present in P4 Prescott 2.4A to 2.8A, 2004-02-01 - without any other extras like NX, 64-bit or Hyper-Threading, compatible with basic/initial P4 chipsets and boards with ISA slots
  • present in slower CPUs:
    • VIA C7/Eden 400MHz, NX (that's also the slowest SSE2 CPU)
    • Core Solo U1300 1.07GHz, applicable if NX/PSE is possible as discussed, Constant TSC
    • Celeron D 310 2.13GHz, Constant TSC
  • ISA slot boards examples: SOLTEK SL-XP865G-3IG Socket 478, iBASE MB880 i915/ICH6 LGA775, more

SSSE3

  • relatively few instructions and notably the last MMX instructions, last instructions to use the x87 registers
  • latest chipset with drivers for DOS/Win9x supports such CPUs: P4M890
  • software requiring those: 32-bit OBS, OEL7.1, Skype for Linux; Mesa9x can use them; others?
  • "simplest" CPU where those are present: Atom N270 1600MHz, 32-bit only, NX (Netbook)
  • slowest CPU where those are present:
    • Vortex86EX2 600MHz, 32-bit only, NX (Embedded) - the most recent 32-bit only x86 CPU (released 2018)
    • RDC R30460 - 1600MHz quad core, 32-bit only, NX, 2D GPU
    • Atom E620 600MHz, 32-bit only, NX, MOVBE, Constant TSC, VT-x (Embedded)
    • Atom Z500 800MHz, 32-bit only, NX, MOVBE, Constant TSC (UltraMobilePC)
    • Celeron M ULV523 933 MHz, x86-64, NX, Constant TSC
  • ISA slot boards examples: ISA-588LF ISA Intel Atom N270 CPU Half Card, Q77, more, Vortex86EX2

@joncampbell123
Owner

The experimental CPU type should absolutely support SSE. The CPU_ARCHTYPE_EXPERIMENTAL constant is 0xFF, the highest possible value, precisely because code in the normal core is written to consider SSE if CPU_ArchitectureType >= CPU_ARCHTYPE_PENTIUMIII. CPUID emulation also reports itself as a Pentium III for the experimental type.
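The ordering argument can be sketched as follows. This is only an illustration of the comparison described here, not DOSBox-X's actual source: the 0xFF value is taken from this comment, while the other numeric values and the `cpu_has_sse` helper are hypothetical placeholders.

```cpp
#include <cstdint>

// Only 0xFF for the experimental type is stated in the comment above.
// The other two values are placeholders chosen for illustration.
constexpr std::uint8_t CPU_ARCHTYPE_PENTIUMII    = 0x60; // hypothetical value
constexpr std::uint8_t CPU_ARCHTYPE_PENTIUMIII   = 0x70; // hypothetical value
constexpr std::uint8_t CPU_ARCHTYPE_EXPERIMENTAL = 0xFF; // highest possible value

// Hypothetical helper: any type ordered at or above Pentium III gets SSE,
// so the experimental type inherits it without a special case.
constexpr bool cpu_has_sse(std::uint8_t arch_type) {
    return arch_type >= CPU_ARCHTYPE_PENTIUMIII;
}

static_assert(cpu_has_sse(CPU_ARCHTYPE_EXPERIMENTAL), "experimental includes SSE");
static_assert(!cpu_has_sse(CPU_ARCHTYPE_PENTIUMII), "Pentium II does not");
```

Because the check is a simple threshold, a feature added to a lower CPU type later is picked up by every higher type automatically.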

Can you show me what combination of software and cputype=experimental is failing to use SSE?

@Torinde
Contributor Author

Torinde commented May 7, 2023

Hmm... now that I tested again - it seems Experimental has SSE, so it's my mistake. Sorry!

I'm testing with a small program "cpu.exe" and here is its output:
[image: cpu.exe output]

and with another one "sse.com" from "simdtest.zip" (which has also MMX and SSE3 tests):
[image: sse.com output]

I'll change the issue title to SSE2+.
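For reference, test programs of this kind typically rely on the feature flags returned by CPUID leaf 1. The sketch below decodes those flags from raw ECX/EDX values; it is not the code of "cpu.exe" or "sse.com", and the `SimdFeatures` struct and function names are made up for illustration. The bit positions are the documented ones for CPUID with EAX=1.

```cpp
#include <cstdint>

// Hypothetical container for the SIMD-related feature flags.
struct SimdFeatures {
    bool mmx;
    bool sse;
    bool sse2;
    bool sse3;
    bool ssse3;
};

// Returns true if bit `n` of `reg` is set.
constexpr bool bit_set(std::uint32_t reg, unsigned n) {
    return ((reg >> n) & 1u) != 0;
}

// Decode the ECX/EDX values that CPUID leaf 1 (EAX=1) would return.
constexpr SimdFeatures decode_cpuid_leaf1(std::uint32_t ecx, std::uint32_t edx) {
    return SimdFeatures{
        bit_set(edx, 23), // MMX:   EDX bit 23
        bit_set(edx, 25), // SSE:   EDX bit 25
        bit_set(edx, 26), // SSE2:  EDX bit 26
        bit_set(ecx, 0),  // SSE3:  ECX bit 0
        bit_set(ecx, 9)   // SSSE3: ECX bit 9
    };
}
```

A Pentium III-class result would have the MMX and SSE bits set in EDX and the SSE2, SSE3 and SSSE3 bits clear.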

Torinde changed the title from "Experimental CPU type to get SSE support" to "SSE2 and beyond emulation" on May 7, 2023
@Torinde
Contributor Author

Torinde commented Aug 14, 2023

JHRobotics/mesa9x, part of their Win9x 3D-accelerated driver set, can use SSE3, SSE4.2, AVX or AVX2 to get better performance. Also, depending on how you interpret the description, SSE3 may be required for Win98, with only Win95 able to work on a Pentium II. Newer Mesa versions may also require later instruction sets?
JHRobotics/simd95 is a "Simple hack for enabling SSE/AVX instructions on DOS and Windows 95/98"

@Torinde
Contributor Author

Torinde commented Sep 8, 2023

joncampbell123 wrote: "I'm not sure the design inherited from SVN would allow the longer page table length and size required for PSE to work, and therefore I don't think NX emulation is going to happen. PAE might be possible though."

Per Wikipedia's "NX bit" article, NX depends on PAE, not PSE: "It is only available with the long mode (64-bit mode) or legacy Physical Address Extension (PAE) page-table formats, but not x86's original 32-bit page table format because page table entries in that format lack the 64th bit used to disable and enable execution."
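A minimal sketch of why the entry width matters, assuming the NX flag sits in bit 63 of a 64-bit PAE or long-mode page-table entry (the "64th bit" in the quote). The function and constant names are hypothetical, and the sketch ignores the EFER.NXE enable bit that must also be set for NX to take effect.

```cpp
#include <cstdint>

// NX (execute-disable) is bit 63 of a 64-bit page-table entry.
constexpr std::uint64_t kPteNxBit = 1ULL << 63;

// Hypothetical helper: checks NX in a 64-bit PAE/long-mode entry.
constexpr bool pae_entry_is_no_execute(std::uint64_t entry) {
    return (entry & kPteNxBit) != 0;
}

// A legacy 32-bit entry only has bits 0..31, so after widening it to
// 64 bits the NX position is always zero, whatever the entry contains.
constexpr bool legacy_entry_can_be_no_execute(std::uint32_t entry) {
    return pae_entry_is_no_execute(static_cast<std::uint64_t>(entry));
}

static_assert(!legacy_entry_can_be_no_execute(0xFFFFFFFFu),
              "a 32-bit entry has no room for bit 63");
```

So an emulator would need the 64-bit PAE entry format before NX could be represented at all, independent of PSE.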

@Torinde
Contributor Author

Torinde commented Apr 10, 2024

It seems a modern desktop CPU has sufficient performance for DOSBox to emulate a P4 at 1.5GHz.
