i#7031: modify marker value filter (#7033)
In some cases we want to modify the value of certain types of markers
(TRACE_MARKER_TYPE_) in the trace.
For example, we might want to avoid exposing the "as traced" CPU schedule
in an offline trace because it might not be representative of the native
execution of the traced program.
Hence, we want to set the value of all TRACE_MARKER_TYPE_CPU_ID in the
trace to unknown (i.e., (uintptr_t)-1 as documented).

To do so we implement `modify_marker_value_filter_t`, a new filter that
is used with `record_filter` as follows:
```
drrun -t drmemtrace -tool record_filter -filter_modify_marker_value 3,0xffffffffffffffff,18,2048 \
-indir path/to/input/trace -outdir path/to/output/trace
```
Here we set the value of TRACE_MARKER_TYPE_CPU_ID == 3 markers to
(uintptr_t)-1 == 0xffffffffffffffff, which represents an unknown CPU,
and TRACE_MARKER_TYPE_PAGE_SIZE == 18 markers to 0x800 == 2048 == 2k pages.
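
The option string above mixes hexadecimal and decimal values. The following
is a minimal sketch of how such a string could be split into integers,
assuming `std::stoull` with base 0 as the updated `parse_string` in this
commit does. The helper name `parse_values` is hypothetical and is not part
of DynamoRIO.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): splits a separator-delimited
// string into unsigned 64-bit integers. Passing base 0 to std::stoull lets
// it accept both decimal ("2048") and 0x-prefixed hex ("0x800") tokens.
// std::stoull throws std::invalid_argument on an empty or non-numeric token.
std::vector<uint64_t>
parse_values(const std::string &s, char sep = ',')
{
    assert(!s.empty());
    std::vector<uint64_t> values;
    std::string::size_type at = 0;
    while (true) {
        std::string::size_type pos = s.find(sep, at);
        // Length of the current token; npos means "until the end of s".
        std::string::size_type len =
            (pos == std::string::npos) ? std::string::npos : pos - at;
        values.push_back(std::stoull(s.substr(at, len), nullptr, /*base=*/0));
        if (pos == std::string::npos)
            break;
        at = pos + 1;
    }
    return values;
}
```

Under these assumptions, `parse_values("3,0xffffffffffffffff,18,2048")` yields
four values, the second of which equals `static_cast<uint64_t>(-1)`.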

In general, this filter takes a list of <TRACE_MARKER_TYPE_,new_value>
pairs and replaces the value of every listed TRACE_MARKER_TYPE_ marker in
the trace with its corresponding new_value.
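
The pair-list semantics described above can be sketched in isolation as
follows. This is a simplified illustration and not the filter's actual code:
`marker_t`, `build_marker_map`, and `apply_marker_map` are hypothetical names,
and the real filter operates on `trace_entry_t` records inside
`record_filter`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in for a marker record (hypothetical; the real
// trace_entry_t has a different layout).
struct marker_t {
    uint64_t type;
    uint64_t value;
};

// Builds a marker-type -> new-value map from a flat <type,new_value> list.
// Returns false and sets error if the list is empty or has an odd size.
// If a type is listed more than once, the last pair wins.
bool
build_marker_map(const std::vector<uint64_t> &pairs,
                 std::unordered_map<uint64_t, uint64_t> &map, std::string &error)
{
    if (pairs.empty()) {
        error = "List of <type,new_value> pairs is empty.";
        return false;
    }
    if (pairs.size() % 2 != 0) {
        error = "List of <type,new_value> pairs has an odd size.";
        return false;
    }
    for (size_t i = 0; i < pairs.size(); i += 2)
        map[pairs[i]] = pairs[i + 1];
    return true;
}

// Overwrites the value of a marker whose type is in the map; markers of
// unlisted types are left unchanged.
void
apply_marker_map(const std::unordered_map<uint64_t, uint64_t> &map,
                 marker_t &marker)
{
    auto it = map.find(marker.type);
    if (it != map.end())
        marker.value = it->second;
}
```

A map keyed by marker type keeps the per-record lookup cheap, which matters
because the check runs on every marker in the trace.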

We add a unit test `test_modify_marker_value_filter()` and an end-to-end
test `code_api|tool.record_filter_modify_marker_value` which invokes the
`invariant_checker` and `view` tools on the filtered trace.

Fixes #7031
edeiana authored Oct 16, 2024
1 parent 36e7538 commit 5f74563
Showing 14 changed files with 275 additions and 8 deletions.
4 changes: 4 additions & 0 deletions api/docs/release.dox
@@ -265,6 +265,10 @@ Further non-compatibility-affecting changes include:
- Added -trace_instr_intervals_file option to the drmemtrace trace analysis tools
framework. The file must be in CSV format containing a <start,duration> tracing
interval per line where start and duration are expressed in number of instructions.
- Added modify_marker_value_filter_t to #dynamorio::drmemtrace::record_filter_t to modify
the value of TRACE_MARKER_TYPE_ markers. This filter takes a list of
<TRACE_MARKER_TYPE_,new_value> and changes every listed marker in the trace to its
corresponding new_value.

**************************************************
<hr>
1 change: 1 addition & 0 deletions clients/drcachesim/CMakeLists.txt
@@ -201,6 +201,7 @@ add_exported_library(drmemtrace_record_filter STATIC
tools/filter/type_filter.h
tools/filter/encodings2regdeps_filter.h
tools/filter/func_id_filter.h
tools/filter/modify_marker_value_filter.h
tools/filter/null_filter.h)
target_link_libraries(drmemtrace_record_filter drmemtrace_simulator
drmemtrace_schedule_file)
3 changes: 2 additions & 1 deletion clients/drcachesim/analyzer_multi.cpp
@@ -339,7 +339,8 @@ record_analyzer_multi_t::create_analysis_tool_from_options(const std::string &to
op_filter_cache_size.get_value(), op_filter_trace_types.get_value(),
op_filter_marker_types.get_value(), op_trim_before_timestamp.get_value(),
op_trim_after_timestamp.get_value(), op_encodings2regdeps.get_value(),
-op_filter_func_ids.get_value(), op_verbose.get_value());
+op_filter_func_ids.get_value(), op_modify_marker_value.get_value(),
+op_verbose.get_value());
}
ERRMSG("Usage error: unsupported record analyzer type \"%s\". Only " RECORD_FILTER
" is supported.\n",
9 changes: 9 additions & 0 deletions clients/drcachesim/common/options.cpp
@@ -1112,6 +1112,15 @@ droption_t<std::string>
"for the listed function IDs and removes those belonging to "
"unlisted function IDs.");

droption_t<std::string> op_modify_marker_value(
DROPTION_SCOPE_FRONTEND, "filter_modify_marker_value", "",
"Comma-separated pairs of integers representing <TRACE_MARKER_TYPE_, new_value>.",
"This option is for -tool " RECORD_FILTER ". It modifies the value of all listed "
"TRACE_MARKER_TYPE_ markers in the trace with their corresponding new_value. "
"The list must have an even size. Example: -filter_modify_marker_value 3,24,18,2048 "
"sets all TRACE_MARKER_TYPE_CPU_ID == 3 in the trace to core 24 and "
"TRACE_MARKER_TYPE_PAGE_SIZE == 18 to 2k.");

droption_t<uint64_t> op_trim_before_timestamp(
DROPTION_SCOPE_ALL, "trim_before_timestamp", 0, 0,
(std::numeric_limits<uint64_t>::max)(),
1 change: 1 addition & 0 deletions clients/drcachesim/common/options.h
@@ -230,6 +230,7 @@ extern dynamorio::droption::droption_t<std::string> op_filter_trace_types;
extern dynamorio::droption::droption_t<std::string> op_filter_marker_types;
extern dynamorio::droption::droption_t<bool> op_encodings2regdeps;
extern dynamorio::droption::droption_t<std::string> op_filter_func_ids;
extern dynamorio::droption::droption_t<std::string> op_modify_marker_value;
extern dynamorio::droption::droption_t<uint64_t> op_trim_before_timestamp;
extern dynamorio::droption::droption_t<uint64_t> op_trim_after_timestamp;
extern dynamorio::droption::droption_t<bool> op_abort_on_invariant_error;
3 changes: 3 additions & 0 deletions clients/drcachesim/common/trace_entry.h
@@ -699,6 +699,9 @@ typedef enum {
// Values below here are available for users to use for custom markers.
} trace_marker_type_t;

// As documented in TRACE_MARKER_TYPE_CPU_ID, this value indicates an unknown CPU.
#define INVALID_CPU_MARKER_VALUE static_cast<uintptr_t>(-1)

/** Constants related to function or system call parameter tracing. */
enum class func_trace_t : uint64_t { // VS2019 won't infer 64-bit with "enum {".
/**
@@ -0,0 +1,20 @@
Estimation of pi is 3.142425985001098

Trace invariant checks passed

Output .* entries from .* entries.

Output format:

<--record#-> <--instr#->: <---tid---> <record details>

------------------------------------------------------------

1 0: +[0-9]+ <marker: version [0-9]>
2 0: +[0-9]+ <marker: filetype 0x[0-9a-f]*>
3 0: +[0-9]+ <marker: cache line size [0-9]*>
4 0: +[0-9]+ <marker: chunk instruction count [0-9]*>
5 0: +[0-9]+ <marker: page size 2048>
6 0: +[0-9]+ <marker: timestamp [0-9]*>
7 0: +[0-9]+ <marker: tid [0-9]* on core unknown>
.*
77 changes: 76 additions & 1 deletion clients/drcachesim/tests/record_filter_unit_tests.cpp
@@ -44,6 +44,7 @@
#include "tools/filter/type_filter.h"
#include "tools/filter/encodings2regdeps_filter.h"
#include "tools/filter/func_id_filter.h"
#include "tools/filter/modify_marker_value_filter.h"
#include "trace_entry.h"
#include "zipfile_ostream.h"

@@ -600,6 +601,80 @@ test_func_id_filter()
return true;
}

static bool
test_modify_marker_value_filter()
{
constexpr addr_t PC = 0x7f6fdd3ec360;
constexpr addr_t ENCODING = 0xe78948;
constexpr uint64_t NEW_PAGE_SIZE_MARKER_VALUE = 0x800; // 2k pages.
std::vector<test_case_t> entries = {
// Trace shard header.
{ { TRACE_TYPE_HEADER, 0, { 0x1 } }, true, { true } },
{ { TRACE_TYPE_MARKER, TRACE_MARKER_TYPE_VERSION, { 0x2 } }, true, { true } },
{ { TRACE_TYPE_MARKER,
TRACE_MARKER_TYPE_FILETYPE,
{ OFFLINE_FILE_TYPE_ARCH_X86_64 | OFFLINE_FILE_TYPE_ENCODINGS |
OFFLINE_FILE_TYPE_SYSCALL_NUMBERS | OFFLINE_FILE_TYPE_BLOCKING_SYSCALLS } },
true,
{ true } },
{ { TRACE_TYPE_THREAD, 0, { 0x4 } }, true, { true } },
{ { TRACE_TYPE_PID, 0, { 0x5 } }, true, { true } },
{ { TRACE_TYPE_MARKER, TRACE_MARKER_TYPE_CACHE_LINE_SIZE, { 0x6 } },
true,
{ true } },
{ { TRACE_TYPE_MARKER, TRACE_MARKER_TYPE_PAGE_SIZE, { 0x1000 } }, // 4k pages.
true,
{ false } },
// Overwrite the value of TRACE_MARKER_TYPE_PAGE_SIZE with 0x800 == 2048 == 2k
// page size.
{ { TRACE_TYPE_MARKER,
TRACE_MARKER_TYPE_PAGE_SIZE,
{ NEW_PAGE_SIZE_MARKER_VALUE } },
false,
{ true } },
{ { TRACE_TYPE_MARKER, TRACE_MARKER_TYPE_TIMESTAMP, { 0x7 } }, true, { true } },
{ { TRACE_TYPE_MARKER, TRACE_MARKER_TYPE_CPU_ID, { 0x8 } }, true, { false } },
// Overwrite the value of TRACE_MARKER_TYPE_CPU_ID with ((uintptr_t)-1).
{ { TRACE_TYPE_MARKER, TRACE_MARKER_TYPE_CPU_ID, { INVALID_CPU_MARKER_VALUE } },
false,
{ true } },
// We need at least one instruction with encodings to make record_filter output
// the trace.
{ { TRACE_TYPE_ENCODING, 3, { ENCODING } }, true, { true } },
{ { TRACE_TYPE_INSTR, 3, { PC } }, true, { true } },

{ { TRACE_TYPE_FOOTER, 0, { 0x0 } }, true, { true } },
};

// Construct modify_marker_value_filter_t. We change TRACE_MARKER_TYPE_CPU_ID values
// with INVALID_CPU_MARKER_VALUE == ((uintptr_t)-1) and TRACE_MARKER_TYPE_PAGE_SIZE
// with 2k.
std::vector<uint64_t> modify_marker_value_pairs_list = { TRACE_MARKER_TYPE_CPU_ID,
INVALID_CPU_MARKER_VALUE,
TRACE_MARKER_TYPE_PAGE_SIZE,
NEW_PAGE_SIZE_MARKER_VALUE };
std::vector<std::unique_ptr<record_filter_func_t>> filters;
auto modify_marker_value_filter = std::unique_ptr<record_filter_func_t>(
new dynamorio::drmemtrace::modify_marker_value_filter_t(
modify_marker_value_pairs_list));
if (!modify_marker_value_filter->get_error_string().empty()) {
fprintf(stderr, "Couldn't construct a modify_marker_value_filter %s",
modify_marker_value_filter->get_error_string().c_str());
return false;
}
filters.push_back(std::move(modify_marker_value_filter));

// Construct record_filter_t.
test_record_filter_t record_filter(std::move(filters), 0, /*write_archive=*/true);

// Run the test.
if (!process_entries_and_check_result(&record_filter, entries, 0))
return false;

fprintf(stderr, "test_modify_marker_value_filter passed\n");
return true;
}

static bool
test_cache_and_type_filter()
{
@@ -1450,7 +1525,7 @@ test_main(int argc, const char *argv[])
dr_standalone_init();
if (!test_cache_and_type_filter() || !test_chunk_update() || !test_trim_filter() ||
!test_null_filter() || !test_wait_filter() || !test_encodings2regdeps_filter() ||
-!test_func_id_filter())
+!test_func_id_filter() || !test_modify_marker_value_filter())
return 1;
fprintf(stderr, "All done!\n");
dr_standalone_exit();
111 changes: 111 additions & 0 deletions clients/drcachesim/tools/filter/modify_marker_value_filter.h
@@ -0,0 +1,111 @@
/* **********************************************************
* Copyright (c) 2024 Google, Inc. All rights reserved.
* **********************************************************/

/*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of Google, Inc. nor the names of its contributors may be
* used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL VMWARE, INC. OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
* DAMAGE.
*/

#ifndef _MODIFY_MARKER_VALUE_FILTER_H_
#define _MODIFY_MARKER_VALUE_FILTER_H_ 1

#include "record_filter.h"
#include "trace_entry.h"

#include <cstring>
#include <unordered_map>

namespace dynamorio {
namespace drmemtrace {

/* This filter takes a list of <TRACE_MARKER_TYPE_,new_value> pairs and modifies the value
* of all listed markers in the trace with the given new_value.
*/
class modify_marker_value_filter_t : public record_filter_t::record_filter_func_t {
public:
modify_marker_value_filter_t(std::vector<uint64_t> modify_marker_value_pairs_list)
{
size_t list_size = modify_marker_value_pairs_list.size();
if (list_size == 0) {
error_string_ = "List of <TRACE_MARKER_TYPE_,new_value> pairs is empty.";
} else if (list_size % 2 != 0) {
error_string_ = "List of <TRACE_MARKER_TYPE_,new_value> pairs is missing "
"part of a pair as its size is not even";
} else {
for (size_t i = 0; i < list_size; i += 2) {
trace_marker_type_t marker_type =
static_cast<trace_marker_type_t>(modify_marker_value_pairs_list[i]);
uint64_t new_value = modify_marker_value_pairs_list[i + 1];
// We ignore duplicate pairs and use the last pair in the list.
marker_to_value_map_[marker_type] = new_value;
}
}
}

void *
parallel_shard_init(memtrace_stream_t *shard_stream,
bool partial_trace_filter) override
{
return nullptr;
}

bool
parallel_shard_filter(
trace_entry_t &entry, void *shard_data,
record_filter_t::record_filter_info_t &record_filter_info) override
{
trace_type_t entry_type = static_cast<trace_type_t>(entry.type);
// Output any trace_entry_t that's not a marker.
if (entry_type != TRACE_TYPE_MARKER)
return true;

// Check if the TRACE_TYPE_MARKER_ is in the list of markers for which we want to
// overwrite their value. If not, output the marker unchanged.
trace_marker_type_t marker_type = static_cast<trace_marker_type_t>(entry.size);
const auto &it = marker_to_value_map_.find(marker_type);
if (it == marker_to_value_map_.end())
return true;

// Overwrite marker value.
entry.addr = static_cast<addr_t>(it->second);

return true;
}

bool
parallel_shard_exit(void *shard_data) override
{
return true;
}

private:
std::unordered_map<trace_marker_type_t, uint64_t> marker_to_value_map_;
};

} // namespace drmemtrace
} // namespace dynamorio
#endif /* _MODIFY_MARKER_VALUE_FILTER_H_ */
15 changes: 13 additions & 2 deletions clients/drcachesim/tools/filter/record_filter.cpp
@@ -62,6 +62,7 @@
#include "type_filter.h"
#include "encodings2regdeps_filter.h"
#include "func_id_filter.h"
#include "modify_marker_value_filter.h"

#undef VPRINT
#ifdef DEBUG
@@ -95,7 +96,9 @@ parse_string(const std::string &s, char sep = ',')
std::vector<T> vec;
do {
pos = s.find(sep, at);
-unsigned long long parsed_number = std::stoull(s.substr(at, pos));
+// base = 0 allows to handle both decimal and hex numbers.
+unsigned long long parsed_number =
+std::stoull(s.substr(at, pos), nullptr, /*base = */ 0);
// XXX: parsed_number may be truncated if T is not large enough.
// We could check that parsed_number is within the limits of T using
// std::numeric_limits<>::min()/max(), but this returns 0 on T that are enums,
@@ -119,7 +122,7 @@ record_filter_tool_create(const std::string &output_dir, uint64_t stop_timestamp
const std::string &remove_marker_types,
uint64_t trim_before_timestamp, uint64_t trim_after_timestamp,
bool encodings2regdeps, const std::string &keep_func_ids,
-unsigned int verbose)
+const std::string &modify_marker_value, unsigned int verbose)
{
std::vector<
std::unique_ptr<dynamorio::drmemtrace::record_filter_t::record_filter_func_t>>
@@ -160,6 +163,14 @@ record_filter_tool_create(const std::string &output_dir, uint64_t stop_timestamp
std::unique_ptr<dynamorio::drmemtrace::record_filter_t::record_filter_func_t>(
new dynamorio::drmemtrace::func_id_filter_t(keep_func_ids_list)));
}
if (!modify_marker_value.empty()) {
std::vector<uint64_t> modify_marker_value_pairs_list =
parse_string<uint64_t>(modify_marker_value);
filter_funcs.emplace_back(
std::unique_ptr<dynamorio::drmemtrace::record_filter_t::record_filter_func_t>(
new dynamorio::drmemtrace::modify_marker_value_filter_t(
modify_marker_value_pairs_list)));
}

// TODO i#5675: Add other filters.

5 changes: 4 additions & 1 deletion clients/drcachesim/tools/filter/record_filter_create.h
@@ -67,6 +67,9 @@ namespace drmemtrace {
* @param[in] keep_func_ids A comma-separated list of integers representing the
* function IDs related to #TRACE_MARKER_TYPE_FUNC_ID (and _ARG, _RETVAL, _RETADDR)
* markers to preserve in the trace, while removing all other function markers.
* @param[in] modify_marker_value A list of comma-separated pairs of integers representing
* <TRACE_MARKER_TYPE_, new_value> to modify the value of all listed TRACE_MARKER_TYPE_
* in the trace with their corresponding new_value.
* @param[in] verbose Verbosity level for notifications.
*/
record_analysis_tool_t *
@@ -75,7 +78,7 @@ record_filter_tool_create(const std::string &output_dir, uint64_t stop_timestamp
const std::string &remove_marker_types,
uint64_t trim_before_timestamp, uint64_t trim_after_timestamp,
bool encodings2regdeps, const std::string &keep_func_ids,
-unsigned int verbose);
+const std::string &modify_marker_value, unsigned int verbose);

} // namespace drmemtrace
} // namespace dynamorio
13 changes: 12 additions & 1 deletion clients/drcachesim/tools/record_filter_launcher.cpp
@@ -138,6 +138,16 @@ droption_t<std::string>
"TRACE_MARKER_TYPE_FUNC_[ID | ARG | RETVAL | RETADDR] "
"markers for the listed function IDs and removed those "
"belonging to unlisted function IDs.");

droption_t<std::string> op_modify_marker_value(
DROPTION_SCOPE_FRONTEND, "filter_modify_marker_value", "",
"Comma-separated pairs of integers representing <TRACE_MARKER_TYPE_, new_value>.",
"This option is for -tool record_filter. It modifies the value of all listed "
"TRACE_MARKER_TYPE_ markers in the trace with their corresponding new_value. "
"The list must have an even size. Example: -filter_modify_marker_value 3,24,18,2048 "
"sets all TRACE_MARKER_TYPE_CPU_ID == 3 in the trace to core 24 and "
"TRACE_MARKER_TYPE_PAGE_SIZE == 18 to 2k.");

} // namespace

int
@@ -168,7 +178,8 @@ _tmain(int argc, const TCHAR *targv[])
op_cache_filter_size.get_value(), op_remove_trace_types.get_value(),
op_remove_marker_types.get_value(), op_trim_before_timestamp.get_value(),
op_trim_after_timestamp.get_value(), op_encodings2regdeps.get_value(),
-op_filter_func_ids.get_value(), op_verbose.get_value()));
+op_filter_func_ids.get_value(), op_modify_marker_value.get_value(),
+op_verbose.get_value()));
std::vector<record_analysis_tool_t *> tools;
tools.push_back(record_filter.get());

9 changes: 7 additions & 2 deletions clients/drcachesim/tools/view.cpp
@@ -335,8 +335,13 @@ view_t::parallel_shard_memref(void *shard_data, const memref_t &memref)
// see a cpuid marker on a thread switch. To avoid that assumption
// we would want to track the prior tid and print out a thread switch
// message whenever it changes.
-std::cerr << "<marker: tid " << memref.marker.tid << " on core "
-<< memref.marker.marker_value << ">\n";
+if (memref.marker.marker_value == INVALID_CPU_MARKER_VALUE) {
+std::cerr << "<marker: tid " << memref.marker.tid
+<< " on core unknown>\n";
+} else {
+std::cerr << "<marker: tid " << memref.marker.tid << " on core "
+<< memref.marker.marker_value << ">\n";
+}
break;
case TRACE_MARKER_TYPE_KERNEL_EVENT:
if (trace_version_ <= TRACE_ENTRY_VERSION_NO_KERNEL_PC) {