cleaned finder and removed pbc (getting it from Lattice) #706
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #706 +/- ##
==========================================
- Coverage 86.67% 86.64% -0.03%
==========================================
Files 399 399
Lines 50727 50739 +12
==========================================
- Hits 43966 43964 -2
- Misses 6761 6775 +14 |
The problem with that output is that the second atom should have If you can't pass |
You just change the bc in the contained geometry. I don't think we should complicate things here. What we really want is that the pbc of the lattice should govern everything. Having a geometry without pbc, then asking for pbc neighbors is also confusing, hence the rationale. I can't (immediately) see a workflow where changing between pbc and not would be valuable. As for the neighbor return, we need to figure out what out of cell coordinates mean, because it should essentially be the same as |
To expand on the neighbor locating stuff from outside the unit-cell. Consider these two systems:

    xyz1 = [
        [0, 0, 0],
        [3, 0, 0]
    ]
    cell1 = cell2 = [2, 10, 10]
    xyz2 = [
        [0, 0, 0],
        [1, 0, 0]
    ]
    pbc1 = pbc2 = True

they are effectively the same system. So we should treat it as such. However, it won't be as simple as translating to the primary unit cell. Consider this system:

    xyz1 = [
        [0, 0, 0],
        [1, 0, 0.5],
        [1, 1, -0.5]
    ]
    xyz2 = xyz1 + [0, 0, 2]
    pbc = True, True, False

here a translation into the UC wouldn't work. |
On your first example, I don't think both should be treated the same. There is a supercell shift in the second atom and that shift (but with opposite sign) should also be present on the neighbors. For that, we can translate to the unit cell and store the original supercell index. I don't understand the problem with the second example. |
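As a rough illustration of "translate to the unit cell and store the original supercell index", here is a minimal sketch. It assumes an orthogonal cell given by its three side lengths, and the function name `wrap_to_unit_cell` is hypothetical (it is not part of sisl). Only periodic directions are wrapped; non-periodic ones are left untouched.

```python
import math


def wrap_to_unit_cell(xyz, cell, pbc):
    """Wrap one atom into an orthogonal cell along periodic directions only.

    Returns (wrapped_xyz, shift), where shift[d] is the integer number of
    lattice vectors removed along direction d (always 0 if pbc[d] is False).
    Hypothetical helper for illustration; assumes an orthogonal cell.
    """
    wrapped, shift = [], []
    for x, length, periodic in zip(xyz, cell, pbc):
        n = math.floor(x / length) if periodic else 0
        wrapped.append(x - n * length)
        shift.append(n)
    return wrapped, shift


cell = [2.0, 10.0, 10.0]

# Atom at x = 3 in a cell of length 2: wrapped to x = 1 with a shift of +1.
print(wrap_to_unit_cell([3.0, 0.0, 0.0], cell, (True, True, True)))

# A negative z along a non-periodic direction is kept as-is.
print(wrap_to_unit_cell([1.0, 1.0, -0.5], cell, (True, True, False)))
```

Keeping the shift alongside the wrapped coordinate means the original position can always be recovered, whichever convention is chosen for reporting neighbors.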
So the example above, one should find that the first atom has 2 neighbors, at
The problem with the 2nd example is that one cannot simply translate all coordinates, since in most systems a lattice vector of finite size without PBC would still wrap around. So in this case the user has all atoms lying at the edge of the lattice box. Then an MD would push the atom outside of the lattice box. So a translation without PBC would mean that the coordinates are not neighbors (whereas in fact they are). |
I don't agree with what you say from the first example. I think for
And in
(each neighbor is In the second example, I hadn't seen the |
That's also what I wrote. :)
Yes, but the problem is that the coordinates typically come from a code, and those codes have a lattice vector. E.g. in Siesta for a slab, you could see the above (2nd case), but you could also have a case where:
and for Siesta, this would still amount to them being neighbors. So it is not a matter of pure translation, nor about UC-translations. |
In the first example what I didn't agree with is what you say about The example that you give about SIESTA, then |
And note, Siesta also has some problems with this, see https://gitlab.com/siesta-project/siesta/-/issues/129 |
I don't see it necessarily as a problem. These are two different situations, and the supercell shifts are different. The atom that has been displaced outside the unit cell is not the same atom as its image in the unit cell, so I think it's more consistent to keep this differentiation. The auxiliary supercell gets bigger, but the number of entries in the matrix stays the same. |
Ok, but then we need some way to alert users when they change
Well, the problem is that the integers used to hold the indices grow, and you'll reach the limits much faster (currently int32 can be hit quite fast with supercells). Also it gets difficult when users want to cut couplings (through |
Hmm I guess, but it would probably be overkill, we could just introduce the check on the method that sets boundary conditions. Although I think there's no need to worry that much about this. It is perfectly logical that removing periodicity will mean that atoms that interacted through periodic images will no longer interact. Sure, some users will be surprised the first time their code fails if they don't take this into account, but then they'll learn and will understand that the behavior is perfectly consistent. Maybe later we can provide a method that detects vacuum and moves its center to the edge of the cell, if that is useful.
Well if they want to do that they can first translate to the unit cell, that's easy to solve.
How fast is "quite fast"? I just did a quick calculation and we are talking about 150k atoms (20 orbitals each) and a 9×9×9 auxiliary cell. And the more atoms, the more unlikely it is that you need an extra supercell, because the cell will be bigger so an atom will need a really long simulation to go through it. And if it becomes a problem, you can always translate everything into the unit cell or go with int64, whatever you prefer. I think that translating to the unit cell for computing neighbors but keeping the coordinates outside of the unit cell could very easily lead to inconsistencies in different parts of the code. Also I don't think it necessarily makes things cleaner and easier to understand. Imagine a molecule in which half of the atoms have moved outside the unit cell. If you want to move everything to the unit cell for computing neighbors, some intramolecular interactions will be supercell interactions. Isn't that weird? My point is that not everything is made clearer by translating to the unit cell and therefore I don't think it is worth it to risk losing consistency. You can always translate to the unit cell explicitly if you need/want it, and then there won't be any risk of inconsistencies. |
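The arithmetic behind the int32 concern can be checked in a few lines. This is only a sketch under an assumption: that the largest index stored is roughly the number of orbitals times the number of cells in the auxiliary supercell. The helper name `n_columns` is made up for the example.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def n_columns(n_atoms, orbitals_per_atom, nsc):
    """Orbitals in the unit cell times the number of auxiliary supercells.

    Assumption for illustration: indices run over orbitals x supercells.
    """
    n_orbitals = n_atoms * orbitals_per_atom
    n_cells = nsc[0] * nsc[1] * nsc[2]
    return n_orbitals * n_cells


# 150k atoms, 20 orbitals each, 9x9x9 auxiliary cell.
cols = n_columns(150_000, 20, (9, 9, 9))
print(cols, cols > INT32_MAX)

# Largest number of orbitals that still fits with a 3x3x3 auxiliary cell.
print(INT32_MAX // 27)
```

Under this assumption the 9×9×9 case lands just above the int32 limit, and the same check can be repeated for any other auxiliary supercell size.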
Yeah. That might be better.
Ok, I was too quick, it happens first at ~80 million orbitals.
But even your code (and mine as well) still thinks these are supercell connections, right? |
What I'm saying precisely is that we shouldn't convert anything. Internally for practical reasons of the algorithm we should use the periodic images that are inside the unit cell, but we should take that into account when we return the neighbors.
I would keep the atom outside the unit cell and take that into account when computing neighbors.
Hmm no, unless I have a bug, I try to adhere to SIESTA's way of describing connections, which I think is the most consistent one (although I admit that at the beginning it was not easy to understand). For example in the neighbor finder I didn't take this into account, so it is wrong 😅 There might be cases in which having all atoms in the unit cell is easier to manage and leads to the same result (e.g. to compute the density on the grid), but then what you can do is |
I will merge this, but it would be great if you could solve the neighbor problem in the |
The neighbor finder got an overhaul to check for PBC and friends. Signed-off-by: Nick Papior <[email protected]>
But we still didn't agree on the convention to follow 😅 |
True, but first step is to make it actually find the neighbors.
should return neighbors for both atoms, regardless of convention. I.e. it has PBC. I am also thinking about whether we should change the return value to be a tuple of indices, having everything in one array is nice. But I think it would be more easily understood by the end-user if it is returned in a tuple:
or perhaps a namedtuple so we can extend the return values? Then we should take up the discussion of the convention later (it shouldn't matter here, no?) |
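To make the tuple versus namedtuple idea concrete, here is a hypothetical sketch. The `Neighbors` type, its field names, and the assumed row layout `(i, j, isc_x, isc_y, isc_z)` are illustrative only and are not the finder's actual return format; the sample rows are made-up data.

```python
from typing import List, NamedTuple, Tuple


class Neighbors(NamedTuple):
    """Hypothetical return type; field names are illustrative only."""

    i: List[int]                     # index of the first atom in each pair
    j: List[int]                     # index of the second atom in each pair
    isc: List[Tuple[int, int, int]]  # supercell offset of the second atom


def split_pairs(rows):
    """Split rows of (i, j, isc_x, isc_y, isc_z) into a Neighbors tuple."""
    return Neighbors(
        i=[r[0] for r in rows],
        j=[r[1] for r in rows],
        isc=[tuple(r[2:5]) for r in rows],
    )


# Made-up pair data in the assumed single-array layout.
rows = [(0, 1, 0, 0, 0), (0, 1, -1, 0, 0), (1, 0, 0, 0, 0), (1, 0, 1, 0, 0)]
neighbors = split_pairs(rows)

# Plain tuple unpacking still works, and fields can be accessed by name.
i, j, isc = neighbors
print(neighbors.isc[1])
```

A namedtuple keeps positional unpacking working for existing callers while making the meaning of each entry explicit, which is the trade-off raised in the comment above.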
The neighbor finder got an overhaul to check for PBC and friends.
Removed pbc arguments to routines, they now get it from the geometry object contained.
find_all_unique_pairs changed name to find_unique_pairs (all was superfluous)
test_finder.py where one is located outside the unit cell. This problem is also inherent to the current way of doing things, but I wanted to give it a try here as well. And I found the same problem. I.e. if one runs this small snippet one also gets:
output
@pfebrer do you have any clue what goes on here?
Also, we need to discuss what should be done, because the current way things are handled (in both this and the old part of the code) needs to be fixed one way or another. I don't know if one should always do a translation to the UC for periodic directions, with special care taken along non-periodic directions.