Build GHC with cabal-install and a Makefile #3

Draft
wants to merge 71 commits into
base: master

hsyl20 commented Nov 27, 2024

  • Build ghc-stage1
  • Generate valid non-cross stage1 settings (reusing stage0's settings for the most part)
  • Build ghc-stage2
  • For every host/arch we want to build a compiler for:
    • generate valid stage2 settings for the target (using ghc-toolchain?)
    • use ghc-stage2 with these settings to build root libraries
    • use ghc-stage2 with these settings to build iserv
    • (optional) use ghc-stage2 with these settings to cross-build a GHC
  • ...

bgamari and others added 13 commits November 25, 2024 03:55
As noted in #25509, the `-this-package-name` must be passed for each
package to ensure that GHC can resolve references to the packages'
exposed modules via package-qualified imports. Fix this.

Closes #25509.

The default value for base-unit-id is stored in the settings file.

At install time, this can be set by using the BASE_UNIT_ID environment
variable.

At runtime, the value can be set by `-base-unit-id` flag.

For whether all this is a good idea, see #25382

Fixes #25382

As #14497 showed, black holes can appear inside large objects when
we capture a computation and later blackhole it like we do for AP_STACK
closures.

Fixes #24791

This patch makes some minor improvements re nix-in-docker logic in the
ci configuration:

- Update `nixos/nix` to the latest version
- Apply $CPUS to `cores`/`max-jobs` to avoid oversubscribing while
  allowing a reasonable degree of parallelism
- Remove redundant `--extra-experimental-features nix-command` in
  later `nix shell` invocations, it's already configured in
  `/etc/nix/nix.conf`

This patch makes test-bootstrap related ci jobs only depend on
hadrian-ghc-in-ghci job to finish, consistent with other jobs in the
full-build stage generated by gen_ci.hs. This allows the jobs to be
spawned earlier and improves overall pipeline parallelism.

This is never used for lexing / parsing. It is only used by
`GHC.Parser.Header.getOptions`.

Simply uses the multiplicity as stored in the field. As I'm writing
this commit, the only possible multiplicity is 1, but !13525 is
changing this. It's actually easier to take !13525 into account.

Fixes #25515.

This patch bumps macOS minimum SDK version to 11.0 for x86_64-darwin
to align it with aarch64-darwin. This allows us to get rid of the
horrible -Wl,-U,___darwin_check_fd_set_overflow hack, which is causing
linker warnings and testsuite failures on macOS 15. Fixes #25504.

See this CLC proposal:

- haskell/core-libraries-committee#289

and this CLC proposal for background:

- haskell/core-libraries-committee#288

Metric Decrease:
    MultiLayerModulesTH_OneShot

…form

With the Medium code model, the jump range of the generated jump
instruction is larger than that of the Small code model. It's a
temporary fix of the problem described in
https://gitlab.haskell.org/ghc/ghc/-/issues/25495. This commit requires
that the LLVM used contains the code of commit
9dd1d451d9719aa91b3bdd59c0c667983e1baf05, i.e., version 8.0 and later.
Actually we should not rely on LLVM, so the only way to solve this
problem is to implement the LoongArch backend.

Add new type for codemodel

hsyl20 (Author) commented Nov 27, 2024

Current status: it builds some ghc program in _build/stage0/bin/ghc. It seems to be linked with the wrong ghc-boot because it reports a GHC version of 9.8.2 (my bootstrap GHC).

hsyl20 (Author) commented Nov 27, 2024

Ah, I need to pass some environment variable to ghc-boot's Setup.hs

AndreasPK and others added 7 commits November 27, 2024 11:40
When constant folding ensure the result is still within bounds
for the given type by explicitly narrowing the results.

Not doing so results in a lot of spurious assembler warnings
especially when testing primops.
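
To illustrate the narrowing step in isolation, here is a minimal shell sketch. It is not GHC's constant-folding code: the helper names (`narrow_u8`, `narrow_s8`, `fold_add_u8`, `fold_add_s8`) are hypothetical, and 8-bit types are chosen only to keep the numbers small. It folds an addition of two literals and then wraps the result back into the range of the type, which is what avoids emitting an out-of-range immediate.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not GHC's constant folder.

# Narrow an integer to an unsigned 8-bit value by keeping the low 8 bits.
narrow_u8() {
  echo $(( $1 & 0xFF ))
}

# Narrow an integer to a signed 8-bit value: keep the low 8 bits,
# then sign-extend bit 7.
narrow_s8() {
  echo $(( (($1 & 0xFF) ^ 0x80) - 0x80 ))
}

# "Constant fold" an addition of two 8-bit literals, narrowing the
# result so that it still fits the type.
fold_add_u8() {
  narrow_u8 $(( $1 + $2 ))
}

fold_add_s8() {
  narrow_s8 $(( $1 + $2 ))
}

fold_add_u8 200 100   # prints 44  (300 wraps modulo 256)
fold_add_s8 100 100   # prints -56 (200 wraps into the signed range)
```

Without the narrowing call, the folded literals would be 300 and 200, neither of which is a valid value for the respective 8-bit type.
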
We verify that required flags (currently `--output` and `--triple`) are
provided. The implementation is truly awful, but so is getopt.

Begins to address #25500.
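
The commit's own code is not shown in this thread, so the following is only a sketch of what a required-flag check can look like. The function name `check_required_flags` is hypothetical; the flag names `--output` and `--triple` are the ones named in the commit message. It scans the arguments by hand and reports every missing flag.

```shell
#!/bin/sh
# Sketch only: a hypothetical checker, not the code from the commit.

# Succeeds if both --output and --triple appear among the arguments;
# otherwise prints the missing flags to stderr and returns 1.
check_required_flags() {
  have_output=no
  have_triple=no
  for arg in "$@"; do
    case $arg in
      --output|--output=*) have_output=yes ;;
      --triple|--triple=*) have_triple=yes ;;
    esac
  done
  status=0
  if [ "$have_output" = no ]; then
    echo "missing required flag: --output" >&2
    status=1
  fi
  if [ "$have_triple" = no ]; then
    echo "missing required flag: --triple" >&2
    status=1
  fi
  return "$status"
}

# Usage inside a script:
#   check_required_flags "$@" || exit 1
```
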
Currently the ExecPage facility has two users:

 * GHCi, for constructing info tables, and
 * the adjustor allocation path

Although neither of these has any spatial locality constraints, ExecPage
was using the linker's `mmapAnonForLinker`, which tries hard to ensure
that mappings end up near the executable image. This makes adjustor
allocation needlessly subject to fragmentation concerns.

We now instead return less constrained mappings, improving the
robustness of the mechanism.

Addresses #25503.

These were incorrectly changed by the automated refactoring of the
`ghc-internal` migration.

Fixes #25521.

This change means that the Hadrian multi target will include exactprint.
In particular, this means that HLS will work on exactprint inside the GHC tree.

@angerman left a comment

Thank you @hsyl20 this looks like a great start!

@@ -0,0 +1,67 @@
HADRIAN_SETTINGS_STAGE0 := $(shell ghc --info | runghc GenSettings.hs ghc-boot)

Why is this called HADRIAN_SETTINGS?

Author:

That's what the environment variable is called in ghc-boot's Setup.hs ¯_(ツ)_/¯
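
For context, a sketch of how such a value could be handed to a build step through the environment. Only the variable name `HADRIAN_SETTINGS` and the `ghc --info | runghc GenSettings.hs ghc-boot` pipeline come from this thread; the wrapper `with_hadrian_settings` is hypothetical, and the `cabal build` target in the usage comment is an assumption.

```shell
#!/bin/sh
# Sketch only: `with_hadrian_settings` is a hypothetical wrapper.

# with_hadrian_settings SETTINGS COMMAND [ARG...]
# Runs COMMAND with HADRIAN_SETTINGS set to SETTINGS for that command only.
with_hadrian_settings() {
  settings=$1
  shift
  HADRIAN_SETTINGS="$settings" "$@"
}

# Usage (assumes ghc, runghc, cabal and the PR's GenSettings.hs are available;
# the cabal target is an assumption):
#   with_hadrian_settings "$(ghc --info | runghc GenSettings.hs ghc-boot)" \
#     cabal build ghc-boot
```

Scoping the variable to a single command keeps a stage0 value from leaking into later stages that need different settings.
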

Makefile Outdated
Comment on lines 15 to 49
## Substituting variables
cp _build/stage0/src/ghc-bin/ghc-bin.cabal{.in,}
cp _build/stage0/src/libraries/ghc/ghc.cabal{.in,}
cp _build/stage0/src/libraries/ghc/GHC/CmmToLlvm/Version/Bounds.hs{.in,}
cp _build/stage0/src/libraries/ghc-boot/ghc-boot.cabal{.in,}
cp _build/stage0/src/libraries/ghc-boot-th/ghc-boot-th.cabal{.in,}
cp _build/stage0/src/libraries/ghc-heap/ghc-heap.cabal{.in,}
cp _build/stage0/src/libraries/ghci/ghci.cabal{.in,}
cp _build/stage0/src/utils/ghc-pkg/ghc-pkg.cabal{.in,}

sed -i 's/@ProjectVersion@/9.13/' _build/stage0/src/ghc-bin/ghc-bin.cabal
sed -i 's/@ProjectVersionMunged@/9.13/' _build/stage0/src/ghc-bin/ghc-bin.cabal
sed -i 's/@ProjectVersion@/9.13/' _build/stage0/src/libraries/ghc/ghc.cabal
sed -i 's/@ProjectVersionMunged@/9.13/' _build/stage0/src/libraries/ghc/ghc.cabal
sed -i 's/@ProjectVersion@/9.13/' _build/stage0/src/libraries/ghc-boot/ghc-boot.cabal
sed -i 's/@ProjectVersionMunged@/9.13/' _build/stage0/src/libraries/ghc-boot/ghc-boot.cabal
sed -i 's/@ProjectVersion@/9.13/' _build/stage0/src/libraries/ghc-boot-th/ghc-boot-th.cabal
sed -i 's/@ProjectVersionMunged@/9.13/' _build/stage0/src/libraries/ghc-boot-th/ghc-boot-th.cabal
sed -i 's/@Suffix@//' _build/stage0/src/libraries/ghc-boot-th/ghc-boot-th.cabal
sed -i 's/@SourceRoot@/./' _build/stage0/src/libraries/ghc-boot-th/ghc-boot-th.cabal
sed -i 's/@ProjectVersion@/9.13/' _build/stage0/src/libraries/ghc-heap/ghc-heap.cabal
sed -i 's/@ProjectVersionMunged@/9.13/' _build/stage0/src/libraries/ghc-heap/ghc-heap.cabal
sed -i 's/@ProjectVersionForLib@/9.13/' _build/stage0/src/libraries/ghc-heap/ghc-heap.cabal
sed -i 's/@ProjectVersion@/9.13/' _build/stage0/src/libraries/ghci/ghci.cabal
sed -i 's/@ProjectVersionMunged@/9.13/' _build/stage0/src/libraries/ghci/ghci.cabal
sed -i 's/@ProjectVersionForLib@/9.13/' _build/stage0/src/libraries/ghci/ghci.cabal
sed -i 's/@ProjectVersion@/9.13/' _build/stage0/src/utils/ghc-pkg/ghc-pkg.cabal
sed -i 's/@ProjectVersionMunged@/9.13/' _build/stage0/src/utils/ghc-pkg/ghc-pkg.cabal
sed -i 's/@ProjectVersionForLib@/9.13/' _build/stage0/src/utils/ghc-pkg/ghc-pkg.cabal

sed -i 's/@LlvmMinVersion@/13/' _build/stage0/src/libraries/ghc/GHC/CmmToLlvm/Version/Bounds.hs
sed -i 's/@LlvmMaxVersion@/20/' _build/stage0/src/libraries/ghc/GHC/CmmToLlvm/Version/Bounds.hs

We could just use a configure script for this right?

Author:

We could. But we moved away from it upstream.
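
Independently of whether a configure script is used, the repeated `cp`/`sed` pairs in the hunk above all follow one pattern: copy `FILE.in` to `FILE` and replace `@Key@` placeholders. A minimal sketch of that pattern as a single helper follows. `instantiate` is a hypothetical name, not something in the PR, and the sketch assumes that values contain no `|`, `&`, backslash, or newline.

```shell
#!/bin/sh
# Sketch only: `instantiate` is a hypothetical helper, not part of the PR.

# instantiate TARGET KEY=VALUE...
# Reads TARGET.in, replaces every @KEY@ with VALUE, and writes TARGET.
# Assumes values contain no '|', '&', backslash, or newline.
instantiate() {
  target=$1
  shift
  script=
  for kv in "$@"; do
    key=${kv%%=*}
    val=${kv#*=}
    # One sed substitution per line of the script.
    script="${script}s|@${key}@|${val}|g
"
  done
  sed "$script" "$target.in" > "$target"
}

# Usage, mirroring the ghc-boot-th lines of the hunk:
#   instantiate _build/stage0/src/libraries/ghc-boot-th/ghc-boot-th.cabal \
#     ProjectVersion=9.13 ProjectVersionMunged=9.13 Suffix= SourceRoot=.
```

Writing to `TARGET` from `TARGET.in` in one step also avoids `sed -i`, whose syntax differs between GNU and BSD sed.
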

Comment on lines 2 to 23
./_build/stage0/src/ghc-bin/
./_build/stage0/src/libraries/ghc
./_build/stage0/src/libraries/directory/
./_build/stage0/src/libraries/file-io/
./_build/stage0/src/libraries/filepath/
./_build/stage0/src/libraries/ghc-platform/
./_build/stage0/src/libraries/ghc-boot/
./_build/stage0/src/libraries/ghc-boot-th/
./_build/stage0/src/libraries/ghc-heap
./_build/stage0/src/libraries/ghci
./_build/stage0/src/libraries/os-string/
./_build/stage0/src/libraries/process/
./_build/stage0/src/libraries/semaphore-compat
./_build/stage0/src/libraries/time
./_build/stage0/src/libraries/unix/
./_build/stage0/src/libraries/Win32/
./_build/stage0/src/utils/ghc-pkg
./_build/stage0/src/utils/hsc2hs
./_build/stage0/src/utils/unlit
./_build/stage0/src/utils/genprimopcode/
./_build/stage0/src/utils/deriveConstants/

Why do we put them in build/stage0?

Author:

I don't want to modify the source tree directly; this way it's easier to nuke the _build directory and restart from a clean state.

Member:

I think we don't need to copy all source files across. It is also annoying to do it properly with Make.

cabal will use a separate build directory anyway, only some generated cabal files will end up in the source tree.

Author:

We need to perform some substitutions and file generation more than once (for the different stages). I still think it's cleaner to avoid dirtying the source directory. I don't want to have to fix bugs because we've done something for stage0 and forgot to do it again differently for stage1.
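
The per-stage copy argued for here can be sketched as follows. `prepare_stage_tree` is a hypothetical helper, and the `_build/STAGE/src` layout is taken from the paths in the hunk above. Each stage gets its own fresh copy, so anything generated for stage0 cannot leak into stage1.

```shell
#!/bin/sh
# Sketch only: `prepare_stage_tree` is a hypothetical helper.

# prepare_stage_tree SRCROOT STAGE DIR...
# Copies each DIR (relative to SRCROOT, no trailing slash) into
# _build/STAGE/src/DIR, starting from a clean stage directory each time.
prepare_stage_tree() {
  srcroot=$1
  stage=$2
  shift 2
  dest=_build/$stage/src
  rm -rf "$dest"
  for dir in "$@"; do
    mkdir -p "$dest/$(dirname "$dir")"
    cp -R "$srcroot/$dir" "$dest/$dir"
  done
}

# Usage (run from the top of the source tree):
#   prepare_stage_tree . stage0 libraries/ghc-boot utils/ghc-pkg
#   prepare_stage_tree . stage1 libraries/ghc-boot
```
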

hsyl20 (Author) commented Nov 28, 2024

Now it seems like cabal doesn't like empty package databases:

Error: [Cabal-9076]
failed to parse output of 'ghc-pkg dump'

Edit: that was my mistake (a debug statement in ghc-pkg...). Now it works.

bgamari and others added 3 commits November 28, 2024 10:26
Earlier versions of `directory` are racy on Windows due to #24382.

Also includes necessary Hadrian bootstrap plan bump.

Fixes #24382.

GitLab recommends using `https://` to clone submodules and provides the
`GIT_SUBMODULE_FORCE_HTTPS` variable to force this.

Fixes #25528.

Makefile Outdated
@andreabedini (Member)

Effectively we want to completely disable the solver. We can add
active-repositories: :none to cabal.project and remove the source constraints.

hsyl20 (Author) commented Nov 29, 2024

Effectively we want to completely disable the solver. We can add active-repositories: :none to cabal.project and remove the source constraints.

I've disabled it for stage1 where we start fresh to build boot libraries.
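
As a rough sketch of what this looks like in practice, the helper below writes a project file that lists only local package directories and sets `active-repositories: :none`. The helper name `write_stage_project` and the file name `cabal.project.stage1` are assumptions made for illustration; the package paths in the usage comment are taken from the list reviewed earlier in this thread.

```shell
#!/bin/sh
# Sketch only: file name and helper name are assumptions for illustration.

# write_stage_project FILE PACKAGE_DIR...
# Writes a cabal.project-style file that lists only local packages and
# disables all package repositories.
write_stage_project() {
  file=$1
  shift
  {
    echo "packages:"
    for pkg in "$@"; do
      echo "  $pkg"
    done
    echo
    echo "active-repositories: :none"
  } > "$file"
}

# Usage:
#   write_stage_project cabal.project.stage1 \
#     ./_build/stage0/src/libraries/ghc-boot/ \
#     ./_build/stage0/src/libraries/ghc-boot-th/
#   cabal build --project-file=cabal.project.stage1 ...
```
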

GulinSS and others added 22 commits December 8, 2024 13:52
1. Make `staticInitStat`, `staticDeclStat`, `allocUnboxedConStatic`, `allocateStaticList`, `jsStaticArg` local to modules.
2. Remove unused `hdRawStr`, `hdStrStr` from Haskell and JavaScript (`h$pstr`, `h$rstr`, `h$str`).
3. Introduce a special `StaticAppKind` enumeration type and `StaticApp` to represent boxed scalar static applications. Originally, StaticThunk accepted a Maybe that was Nothing for thunks initialized in an alternative way, but this is no longer used.

…tf8`.

This became possible due to the introduction of string unfloating in the Sinker pass (#13185). It saves a few more bytes when optimizing.
Code analysis showed that this optimization would work out of the box if `cachedIdentForId` allowed it for Haskell `Id`s that are represented by a few JavaScript `Ident`s. This is common for strings, which are represented in JavaScript as a pair of values: the string content and the offset at which to start reading the actual string from the full content. The offset is usually 0, but technically we need to allow such complex structures to be treated as "global".

Enabling it there showed that `genToplevelRhs` and `globalOccs` had inaccuracies in their implementations:
1. `globalOccs` operated over JavaScript `Ident`s, but for complex structures it ignored the fact that different Idents can point to the same Id. The algorithm now counts occurrences of Ids.
2. `genToplevelRhs` didn't account for the fact that different Idents pointing to the same Id can occur in mixed order. But the order matters: strings are encoded into 2 variables, where the first is the content and the second is the offset, and their order is not interchangeable. This is fixed by regenerating Idents from the collected Ids, which is fine because all Ident generation goes through the Cache and they are quasi-stable.

We had previously renamed this function for consistency, but that caused unnecessary breakage.

Cabal shouldn't automatically try to set them. We set them explicitly.