-
Notifications
You must be signed in to change notification settings - Fork 662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MapIterator support for Draining map content (e.g., queues,stacks,hash) #1349
base: main
Are you sure you want to change the base?
Conversation
d65df8e
to
9a2089a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for putting in the work. Though I would like to request some changes.
examples/queue/main.go
Outdated
// This program demonstrates attaching an eBPF program to a network interface | ||
// with XDP (eXpress Data Path). The program parses the IPv4 source address | ||
// from packets and pushes the address alongside the computed packet arrival timestamp | ||
// into a Queue. This is just an example and probably does not represent the most | ||
// efficient way to perform such a task. Another potential solution would be to use | ||
// an HashMap with a small __u64 arrays associated to each IPv4 address (key). | ||
// In both the two ways it is possible to lose some packet if (a) queue is not large | ||
// enough or the packet processing time is slow or (b) if the associated array is | ||
// smaller than the actual received packet from an address. | ||
// The userspace program (Go code in this file) prints the contents | ||
// of the map to stdout every second, parsing the raw structure into a human-readable | ||
// IPv4 address and Unix timestamp. | ||
// It is possible to modify the XDP program to drop or redirect packets | ||
// as well -- give it a try! | ||
// This example depends on bpf_link, available in Linux kernel version 5.7 or newer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like you mention yourself, this isn't a prime example of queues or stacks. I understand you want an example to showcase those map types. A good example is small and doesn't add to much extra info. I honestly don't think it adds much to your PR, I suggest dropping it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point and I agree.
Let me know if the ktime -> Unix time
conversion utility function could be useful somehow. I'd leave it up to the user and wouldn't insert it as part of the library.
map.go
Outdated
// For Queue/Stack map block the iteration after maxEntries | ||
// to avoid potential infinite loops | ||
// (values can be pushed to map while doing pop) | ||
if mi.count == mi.maxEntries { | ||
mi.err = fmt.Errorf("%w", ErrIterationAborted) | ||
return false | ||
} | ||
|
||
mi.count++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this check works as intended. If a user indeed uses a stack or queue as kernel to userspace mechanism, and is constantly adding new objects, then you would expect to iterate for more than the maxEntries of the map.
I think we can just leave this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. In that way, is the user that decides whether to stop after certain iterations or not.
map.go
Outdated
if mi.target.typ.isQueueStack() { | ||
return mi.nextQueueMap(valueOut) | ||
} | ||
return mi.next(keyOut, valueOut) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should implement this slightly differently. Now that we have multiple iterator implementations (and I expect more to come), it makes sense to start using an interface with multiple implementations instead.
I suggest renaming MapIterator
, and making it unexported. Then add a MapIterator
interface with the Err() error
and Next(keyOut, valueOut interface{}) bool
methods. And add your logic as an new stack/queue iterator struct.
Then make Map.Iterate
pick the correct iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is perfectly fine by me. I'm going to create two MapIterator implementations, one for key-value maps keyValueMapIterator
and one for value-only maps keylessMapIterator
. Correct me if the naming is horrible.
Concerning the PerCpu maps, I'd leave as it is, so that the old MapIterator
(hence the newly keyValueMapIterator
) handles this case.
map_test.go
Outdated
func TestMapStack(t *testing.T) { | ||
testutils.SkipOnOldKernel(t, "4.20", "map type stack") | ||
|
||
m, err := NewMap(&MapSpec{ | ||
Type: Stack, | ||
ValueSize: 4, | ||
MaxEntries: 2, | ||
}) | ||
if err != nil { | ||
t.Fatal(err) | ||
} | ||
defer m.Close() | ||
|
||
for _, v := range []uint32{42, 4242} { | ||
if err := m.Put(nil, v); err != nil { | ||
t.Fatalf("Can't put %d: %s", v, err) | ||
} | ||
} | ||
|
||
var ( | ||
v uint32 | ||
v2 uint32 | ||
) | ||
if err := m.Lookup(nil, &v); err != nil { | ||
t.Fatal("Lookup (Peek) on Stack:", err) | ||
} | ||
|
||
if err := m.Lookup(nil, &v2); err != nil { | ||
t.Fatal("Lookup (Peek) consecutive on Stack:", err) | ||
} | ||
|
||
if v != v2 { | ||
t.Fatal("Lookup (Peek) value removal from Stack:") | ||
} | ||
|
||
if v != 4242 { | ||
t.Error("Want value 4242, got", v) | ||
} | ||
v = 0 | ||
|
||
if err := m.LookupAndDelete(nil, &v); err != nil { | ||
t.Fatal("Can't lookup and delete element:", err) | ||
} | ||
if v != 4242 { | ||
t.Error("Want value 4242, got", v) | ||
} | ||
|
||
v = 0 | ||
if err := m.LookupAndDelete(nil, unsafe.Pointer(&v)); err != nil { | ||
t.Fatal("Can't lookup and delete element using unsafe.Pointer:", err) | ||
} | ||
if v != 42 { | ||
t.Error("Want value 42, got", v) | ||
} | ||
|
||
if err := m.LookupAndDelete(nil, &v); !errors.Is(err, ErrKeyNotExist) { | ||
t.Fatal("Lookup and delete on empty Stack:", err) | ||
} | ||
|
||
if err := m.Lookup(nil, &v); !errors.Is(err, ErrKeyNotExist) { | ||
t.Fatal("Lookup (Peek) on empty Stack:", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this test is more about testing how the map type is implemented than it does the library. Something that is more at home in the kernel selftests. Same goes for the additions in the queue test. I would simply not bother with these.
On the other hand, I see no tests for the new iteration code. That is what I am interested in seeing, to check that the iterator actually works and stays working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, proceeding this way 👍
I would just leave one additional lookup (peek) test on the Queue for completeness, but please let me know if that is not needed.
11c813b
to
2dfaef8
Compare
Many thanks again Dylan for the feedback! |
2dfaef8
to
23c82af
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patches! I'm not completely convinced we want to make MapIterator an interface (yet).
map.go
Outdated
|
||
// Next decodes the next key and value. | ||
// | ||
// In case of a value-only map (Queue and Stack), the key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an implementation detail, not something the caller needs to take into account explicitly to correctly use the interface. Semantically, keyOut
doesn't seem different from regular kv maps, so I'd drop this paragraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, removing it in the next commit, thanks!
map.go
Outdated
// | ||
// The method must be called after Next returns nil. | ||
// | ||
// For key-value maps, returns ErrIterationAborted if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove the 'for key-value maps' part. The interface shouldn't document the implementation. Generally, ErrIterationAborted
means a full iteration wasn't possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, probably too many comments aren't needed in this case.
map.go
Outdated
// | ||
// Returns false if there are no more entries. You must check | ||
// the result of Err afterwards. | ||
func (mi *keylessMapIterator) next(_, valueOut interface{}) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd name this MapIterator.pop()
and make MapIterator.Next()
call next()
or pop()
depending on the outcome of isQueueStack()
.
types.go
Outdated
@@ -102,6 +102,12 @@ func (mt MapType) hasPerCPUValue() bool { | |||
return mt == PerCPUHash || mt == PerCPUArray || mt == LRUCPUHash || mt == PerCPUCGroupStorage | |||
} | |||
|
|||
// isQueueStack returns true if the Map is a Queue (BPF_MAP_TYPE_QUEUE) | |||
// or Stack (BPF_MAP_TYPE_STACK) | |||
func (mt MapType) isQueueStack() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a shortcut for the map's type instead of its semantics. Maybe needsPop()
would be more descriptive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, going to change it in the next commit!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question regarding semantics:
- What happens when you do a plain Lookup on a queue or stack? Does that function like a peek? Or does it return an error?
- Are there other map types which support LookupAndDelete? Are there ones which support passing a Key argument?
Thanks to all the reviewers for the feedback.
|
Have you had the time to look into my questions in #1349 (review) ? I think they will help us decide what the implementation looks like. |
@lmb I apologize, I missed the message.
|
Clarification, for stacks/queues:
Only hash maps(and variants: LRU, PER_CPU, LRU_PER_CPU) have support for Also interesting is that
Yes, the hashmap supports passing a key value. Actually, even before v5.14, a valid pointer must be provided to the key field. The kernel always checks it and copies it into kernel memory, just to discard it. After v5.14 it of course starts to actually get passed to the map implementation. |
Okay! That was very helpful, thanks to both of you. The peek operations only return the top of the stack / first item in the queue? And currently doing I'm currently leaning towards adding a new method |
Yes, peeking only returns the top of the stack. Any key provided will be ignored as far as I understand. But as @s41m0n mentioned the |
…ue-Stack tests + add queue example 1. MapIterator.Next would previously fail with maps of type Queue or Stack, as their Lookup method does not accept a key and acts just like a peek function 2. Add further tests to the Lookup method (peek) with Queue maps, and introduced tests for Stack maps 3. Add example of using Queue within an XDP program Signed-off-by: Simone Magnani <[email protected]>
… + tests Signed-off-by: Simone Magnani <[email protected]>
c3c539c
to
3acb798
Compare
Thanks for the info-gathering and brainstorming. I just pushed a possible implementation of what has been discussed. |
Signed-off-by: Simone Magnani <[email protected]>
3acb798
to
c620552
Compare
// | ||
// Iterating a hash map from which keys are being deleted is not | ||
// safe. You may see the same key multiple times. Iteration may | ||
// also abort with an error, see IsIterationAborted. | ||
// | ||
// Iterating a queue/stack map returns ErrIterationAborted, as only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to document this on Map.Iterate
instead?
@@ -1545,12 +1557,16 @@ func newMapIterator(target *Map) *MapIterator { | |||
} | |||
} | |||
|
|||
// Next decodes the next key and value. | |||
// Next decodes the next key and value. If the iterator is created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: do you think it's necessary to call this out again? It's already on Drain after all.
// For array-like maps NextKey returns nil only after maxEntries iterations. | ||
// For maps with keySize equal to 0 (Queue/Stack) we return ErrIterationAborted, | ||
// since NextKey returns an error. In this case, Map.Drain should be used instead. | ||
for mi.target.keySize != 0 && mi.count <= mi.maxEntries { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit unsure about using keySize like this, it's a very roundabout way of referring to queue / stack. What if there are maps in the future without a key size?
My personal preference would be to drop the special case and make sure that whatever error we get from Queue.Drain().Next()
due to next key failing is sensible.
// is not supported (e.g., kernel < 5.14). | ||
if mi.fallback { | ||
mi.count++ | ||
mi.err = mi.target.Lookup(mi.cursor, valueOut) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't have the same semantics as LookupAndDelete, since that guarantees that a key / value combo is only returned once if there are concurrent traversal. Those semantics seem useful to me. Would you mind just dropping the fallback?
// - insert key `d` in map | ||
// - retrieve all the remaining keys `d -> b -> c` | ||
mi.err = mi.target.NextKey(nil, mi.cursor) | ||
if errors.Is(mi.err, ErrKeyNotExist) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What error does a NextKey on a queue or stack return?
|
||
if errors.Is(mi.err, ErrKeyNotExist) { | ||
// Same as in MapIterator.nextIterate. | ||
return mi.nextDrain(keyOut, valueOut) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this recurse? To avoid the for loop? I think the for loop is easier to understand here because one just has to look for a break or return statement to figure out how the loop exits. For me that is more difficult with recursive functions.
keySize := uint32(4) | ||
|
||
if mapType == Queue { | ||
testutils.SkipOnOldKernel(t, "4.20", "map type queue") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After dropping the fallback the whole test can just be 5.15 and above I guess?
@@ -1044,6 +1056,49 @@ func TestIterateEmptyMap(t *testing.T) { | |||
} | |||
} | |||
|
|||
func TestDrainEmptyMap(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test necessary? You could just try another Drain at the end of the other tests below.
} | ||
}) | ||
|
||
t.Run(Queue.String(), func(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you take a look at how we use qt.Assert
? Especially qt.DeepEquals
for comparing slices. I think that'll let you shorten the tests a bit.
Dear community, with this PR I would like to introduce the following contributions.
I'll be available for further changes/updates.
MapIterator support for Queue/Stack maps
Ther
MapIterator.Next
would normally fail with maps of type Queue or Stack for two reasons:With that in mind, instead of adding further workarounds and control to each individual call to the previous
MapIterator.Next
function or to eachMap.Lookup
(potentially a non-negligible overhead), I'd propose splitting it into two internal methods. In this case, the program logic of calling all the other functions (e.g., Lookup, BatchLookup that is not supported) to a Queue/Stack is preserved, throwing the expected error.The proposed solution preserves a maximum amount of retrieved items, to avoid an infinite loop as the Queue/Stack may be continuously populated while values are retrieved.
Extended tests for Queue/Stack maps
Example of using a Queue map
I'd introduce an example that demonstrates the usage of a Queue map within an XDP program. The program parses IPv4 packets, retrieving the source address and pushing it, alongside a computed
ktime_ns
into the Queue. In the userspace program, the Queue is periodically emptied, formatting its content into a human-readable string containing the IPv4 address and the timestamp into a goTime
struct.Further Proposal
Would it be useful to introduce a function to parse a
ktime_ns
value intoTime
struct as a utility function of this library? In case, please don't hesitate to point me to the right file where this change could be performed and I can do it. I'd use a similar method that retrieves the boot uptime time once (to prevent continuous syscalls) and simply return it incremented by thektime_ns
argument value.