Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Support resolve the modules with hard-links #410

Open
colinaaa opened this issue Mar 11, 2024 · 5 comments
Open

[Feature Request]: Support resolve the modules with hard-links #410

colinaaa opened this issue Mar 11, 2024 · 5 comments

Comments

@colinaaa
Copy link

What problem does this feature solve?

pnpm creates hard links from the global store to the project's node_modules folders.

For example, imagine you have the following directory structure: packages/a and packages/b have the same version of lodash as a dependency. But they are different symlink that points to different files.

.
├── node_modules
│   ├── a -> ../packages/a
├── package.json
├── packages
│   ├── a
│   │   ├── index.ts
│   │   ├── node_modules
│   │   │   ├── b -> ../../b
│   │   │   └── lodash -> ../../../node_modules/.pnpm/[email protected]/node_modules/lodash
│   │   └── package.json
│   └── b
│       ├── index.ts
│       ├── node_modules
│       │   └── lodash -> .pnpm/[email protected]/node_modules/lodash
│       ├── package.json
│       ├── pnpm-lock.yaml
│       └── pnpm-workspace.yaml
├── pnpm-lock.yaml
└── pnpm-workspace.yaml

Since two lodash/lodash.js points to different files, both rspack(or oxc internally) and webpack(or enhanced-resolve internally) treat the two files as unrelated. So at build time, two copies of lodash code will be packaged in the output bundle, even if their content is the same.

However, it can be found that in the file system, the two lodash.js point exactly to the same inode node. Which means they are the same file in physical storage.

$ ls -i packages/a/node_modules/lodash/lodash.js
2249211 packages/a/node_modules/lodash/lodash.js

$ ls -i packages/b/node_modules/lodash/lodash.js
2249211 packages/b/node_modules/lodash/lodash.js

To solve the problem, we created an enhanced-resolve plugin to cache the request:

import fs from 'node:fs'

import type { ResolveRequest, Resolver } from 'enhanced-resolve'

class InodeWebpackPlugin {
  static #source = 'resolved'

  apply(resolver: Resolver) {
    resolver
      .getHook(InodeWebpackPlugin.#source)
      .tapAsync('INodeCachePlugin', (request, _, callback) => {
        if (!request.path) {
          return callback()
        }

        try {
          const { ino } = fs.statSync(request.path)

          const cachedRequest = this.#cache.get(ino)
          if (cachedRequest) {
            // Note that the query may change for the same path
            // with a different query.
            Object.assign(request, { ...cachedRequest, query: request.query })
            return callback()
          }

          this.#cache.set(ino, request)
        } catch {
          // explicitly ignore error
        }

        return callback()
      })
  }

  #cache = new Map<number, ResolveRequest>()
}

And it works as expected in webpack, the duplicated modules have been eliminated.

So I wonder if this could be added to enhanced-resolve(just like the symlinks options).

Related issue: web-infra-dev/rspack#5912

@alexander-akait
Copy link
Member

Yea, sonds good, do you want to send a PR, also is it possible to enable it by default, i.e. pnpm has ability to detect that it is pnpm, so we will enable this option by default in webpack?

@colinaaa
Copy link
Author

Sure! I'm glad to send a PR. But actually, I'm not 100% sure about the correctness. Since it will make the compilation unstable(depends on the order of the requests).

@alexander-akait
Copy link
Member

@colinaaa I think we have the same issue in webpack, need to search

@alexander-akait
Copy link
Member

webpack-contrib/css-loader#1507, I think related

@fzxen
Copy link

fzxen commented Jul 5, 2024

@colinaaa As far as I know, pnpm created multiple hardlinks because it needs to ensure the accuracy of the peer dependencies.

how-peers-are-resolved

In your example, lodash should have no peer. Maybe you should check if .pnpm/[email protected]/node_modules/lodash and .pnpm/[email protected]/node_modules/lodash are also symlinks. They will eventually point to the same file.

In addition, simple cache should not solve the problem. And it will also destroy the accuracy of peerDependencies.

If the bar has a peer: baz@^1.0.0
if node_modules/.pnpm/xxxx/bar is replaced with node_modules/.pnpm/yyyy/bar. bar will read to baz1.1.0 instead of [email protected]

- package-a
  - [email protected] -> node_modules/.pnpm/xxxx/bar
  - [email protected]
- package-b
  - [email protected] -> node_modules/.pnpm/yyyy/bar
  - [email protected]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants