
Failed to connect to database: Access denied for user 'mogilefs #1

Open
wants to merge 418 commits into master

Conversation


@molele2 molele2 commented Sep 7, 2016

I tried this:
Step 1:

on mysql database:

mysql> create database mogilefs;
Query OK, 1 row affected (0.00 sec)

mysql> grant all on mogilefs.* to 'mogilefs'@'%' identified by 'mogilefs';
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Step 2:

mogdbsetup --dbhost=192.168.2.161 --dbport=3306 --dbname=mogilefs --dbrootuser=root --dbrootpass=198456 --dbuser=mogilefs --dbpass=mogilefs

Step 3:
su mogilefs

bash-4.1$ mogilefsd -c /etc/mogilefs/mogilefsd.conf --daemon
Failed to connect to database: Access denied for user 'mogilefs'@'192.168.2.161' (using password: YES) at /usr/local/share/perl5/MogileFS/Store.pm line 388.

mogilefsd.conf

daemonize = 1
pidfile = /var/run/mogilefsd/mogilefsd.pid
db_dsn = DBI:mysql:mogilefs:host=192.168.2.161
db_user = mogilefs
db_pass = mogilefs
listen = 192.168.2.161:7001
conf_port = 7001
query_jobs = 10
delete_jobs = 1
replicate_jobs = 5
reaper_jobs = 1

[root@bcZmmpe81nZ90s MogileFS]# ls -ld /var/run/mogilefsd
drwxr-xr-x. 2 mogilefs mogilefs 4096 Sep 6 23:42 /var/run/mogilefsd
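
For reference, one way to reproduce the tracker's database connection outside of mogilefsd is a short DBI script using the exact DSN and credentials from the mogilefsd.conf above. This is only a diagnostic sketch (host, database name, and password are copied from the report); comparing USER() with CURRENT_USER() shows which MySQL account entry was actually matched, which helps spot host/grant mismatches.

  #!/usr/bin/perl
  # Diagnostic sketch only: attempt the same connection mogilefsd would make.
  use strict;
  use warnings;
  use DBI;

  my $dsn = "DBI:mysql:mogilefs:host=192.168.2.161";   # db_dsn from mogilefsd.conf
  my $dbh = DBI->connect($dsn, "mogilefs", "mogilefs", { PrintError => 0 })
      or die "connect failed: $DBI::errstr\n";

  # USER() is what the client sent; CURRENT_USER() is the grant entry MySQL matched.
  my ($sent, $matched) = $dbh->selectrow_array("SELECT USER(), CURRENT_USER()");
  print "sent=$sent matched=$matched\n";
  $dbh->disconnect;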

Eric Wong added 30 commits March 29, 2012 17:11
We need to ensure we're sane when dealing with larger files
requiring multiple reads.
This matches the buffer size used by replication, and showed a
performance increase when timing the 100M large file test in
t/40-httpfile.t

With the following patch, I was able to note a ~46 -> ~27s
time difference with both MD5 methods using this change
to increase buffer sizes.

  --- a/t/40-httpfile.t
  +++ b/t/40-httpfile.t
  @@ -125,5 +125,12 @@ $expect = $expect->digest;
   @paths = $mogc->get_paths("largefile");
   $file = MogileFS::HTTPFile->at($paths[0]);
   ok($size == $file->size, "big file size match $size");
  +use Time::HiRes qw/tv_interval gettimeofday/;
  +
  +my $t0;
  +$t0 = [gettimeofday];
   ok($file->md5_mgmt(sub {}) eq $expect, "md5_mgmt on big file");
  +print "mgmt ", tv_interval($t0), "\n";
  +$t0 = [gettimeofday];
   ok($file->md5_http(sub {}) eq $expect, "md5_http on big file");
  +print "http ", tv_interval($t0), "\n";
Base64 requires further escaping for our tracker protocol which
gets ugly and confusing.  It's also easier to interact/verify
with existing command-line tools using hex.
We need a place to store mappings for various checksum
types we'll support.
This is needed to wire up checksums to classes.
Digest::MD5 and Digest::SHA1 both support the same API for
streaming data for the calculation, so we can validate our
content as we stream it.
Checksum usage will be decided on a per-class basis.
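
A minimal sketch of the shared streaming interface referred to above; the chunk size and file name are illustrative only, and Digest::SHA1 can be swapped in for Digest::MD5 unchanged:

  use Digest::MD5;   # Digest::SHA1 offers the same new()/add()/hexdigest() interface

  my $digest = Digest::MD5->new;
  open my $fh, '<', 'some_file' or die $!;    # illustrative file name
  binmode $fh;
  while (read($fh, my $buf, 1024 * 1024)) {   # hash the content as it streams
      $digest->add($buf);
  }
  close $fh;
  print $digest->hexdigest, "\n";
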
This branch is now rebased against my latest clear_cache
which allows much faster metadata updates for testing.
We'll use the "Digest" class in Perl as a guide for this.
Only MD5 is officially supported.

However, this *should* support SHA-(1|256|384|512) and it's easy
to add more algorithms.
We can now:
* enable checksums for classes
* save client-provided checksums to the database
* verify them on create_close
* read them in file_info
We need to be able to both enable and disable checksumming for a class.
This returns undef if a checksum is missing for a class,
and a MogileFS::Checksum object if it exists.
replication now lazily generates checksums if they're not
provided by the client (but required by the storage class).

replication may also verify checksums if they're available
in the database.

replication now sets the Content-MD5 header on PUT requests,
in case the remote server is capable of rejecting corrupt
transfers based on it

replication attempts to verify the checksum of the freshly
PUT-ed file.

TODO: monitor will attempt "test-write" with mangled Content-MD5
      to determine if storage backends are Content-MD5-capable
      so replication can avoid reading checksum on destination
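
For reference, the Content-MD5 header mentioned above carries the base64 encoding of the raw (binary) MD5 digest per RFC 1864. A hedged sketch of building such a PUT with plain LWP follows; the destination URL and payload are made up for illustration, and this is a standalone example rather than the replication worker's actual code path:

  use Digest::MD5 qw(md5);
  use MIME::Base64 qw(encode_base64);
  use HTTP::Request;
  use LWP::UserAgent;

  my $data    = "file contents here";            # illustrative payload
  my $md5_b64 = encode_base64(md5($data), "");   # raw digest, base64, no trailing newline

  my $req = HTTP::Request->new(PUT => "http://storage-node:7500/dev1/0/000/000/0000000123.fid");
  $req->header('Content-MD5' => $md5_b64);
  $req->content($data);

  my $res = LWP::UserAgent->new->request($req);
  die "PUT failed: " . $res->status_line unless $res->is_success;
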
This functionality (and a server capable of rejecting bad MD5s)
will allow us to skip an expensive MogileFS::HTTPFile->digest
request at replication time.

Also testing with the following patch to Perlbal:

  --- a/lib/mogdeps/Perlbal/ClientHTTP.pm
  +++ b/lib/mogdeps/Perlbal/ClientHTTP.pm
@@ -22,6 +22,7 @@ use fields ('put_in_progress', # 1 when we're currently waiting for an async job
             'content_length',  # length of document being transferred
             'content_length_remain', # bytes remaining to be read
             'chunked_upload_state', # bool/obj:  if processing a chunked upload, Perlbal::ChunkedUploadState object, else undef
+            'md5_ctx',
             );

 use HTTP::Date ();
@@ -29,6 +30,7 @@ use File::Path;

 use Errno qw( EPIPE );
 use POSIX qw( O_CREAT O_TRUNC O_WRONLY O_RDONLY ENOENT );
+use Digest::MD5;

 # class list of directories we know exist
 our (%VerifiedDirs);
@@ -61,6 +63,7 @@ sub init {
     $self->{put_fh} = undef;
     $self->{put_pos} = 0;
     $self->{chunked_upload_state} = undef;
+    $self->{md5_ctx} = undef;
 }

 sub close {
@@ -134,6 +137,8 @@ sub handle_put {

     return $self->send_response(403) unless $self->{service}->{enable_put};

+    $self->{md5_ctx} = $hd->header('Content-MD5') ? Digest::MD5->new : undef;
+
     return if $self->handle_put_chunked;

     # they want to put something, so let's setup and wait for more reads
@@ -421,6 +426,8 @@ sub put_writeout {

     my $data = join("", map { $$_ } @{$self->{read_buf}});
     my $count = length $data;
+    my $md5_ctx = $self->{md5_ctx};
+    $md5_ctx->add($data) if $md5_ctx;

     # reset our input buffer
     $self->{read_buf}   = [];
@@ -460,6 +467,17 @@ sub put_close {

     if (CORE::close($self->{put_fh})) {
         $self->{put_fh} = undef;
+
+        my $md5_ctx = $self->{md5_ctx};
+        if ($md5_ctx) {
+            my $actual = $md5_ctx->b64digest;
+            my $expect = $self->{req_headers}->header("Content-MD5");
+            $expect =~ s/=+\s*\z//;
+            if ($actual ne $expect) {
+                return $self->send_response(400,
+                    "Content-MD5 mismatch, expected: $expect actual: $actual");
+            }
+        }
         return $self->send_response(200);
     } else {
         return $self->system_error("Error saving file", "error in close: $!");
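One detail worth noting in the patch above: Digest::MD5's b64digest returns the base64 digest without trailing '=' padding, which is why the expected Content-MD5 header value has its '=' characters stripped before the comparison.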
Rereading a large file is expensive.  If we can monitor
and observe our storage nodes for MD5 rejectionability, we
can rely on that instead of having to have anybody reread
the entire file to calculate its MD5.
Only the fsck part remains to be implemented... And I've never
studied/used fsck much :x
TODO: see if we can use LWP to avoid mistakes like this :x
Fsck behavior is based on existing behavior for size mismatches.
Size failures take precedence, since it's much cheaper to verify
size matches/mismatches than checksum mismatches.

Checksum calculations are expensive, but fsck is already
parallel, so we do not parallelize checksum calculations on
a per-FID basis.
It reads more easily this way, at least to me.
I'll be testing checksum functionality on my home installation
before testing it on other installations, and I run SQLite at
home.

ref: http://www.sqlite.org/lang_altertable.html
We need to ensure the worker stays alive during MD5
generation, especially on large files that can take
many seconds to verify.
This special-cases "NONE" for no hash for our users.
We don't actually use the BLOB type anywhere, as checksums
are definitely not "L"(arge) objects.
The timeout comparison is wrong and causing ping_cb to never
fire.  This went unnoticed since I have reasonably fast disks
on my storage nodes and the <$sock> operation was able to
complete before being hit by a watchdog timeout.
Enabling this setting allows fsck to checksum all replicas on
all devices and report any corrupted copies regardless of
per-class settings.

This feature is useful for determining if enabling checksums on
certain classes is necessary and will also benefit users who
cannot afford to store checksums in the database.
MD5 is faster than SHA1, and much faster than any of the SHA2
variants.  Given the time penalty of fsck is already high with
MD5, prevent folks from shooting themselves in the foot with
extremely expensive hash algorithms.
Unlike the setting it replaces, this new setting can be used to disable
checksumming entirely, regardless of per-class options.

fsck_checksum=(class|off|MD5)

class - the default: fsck based on per-class hashtype
off - skip all checksumming regardless of per-class setting
MD5 - same as the previous fsck_auto_checksum=MD5
Eric Wong and others added 30 commits December 15, 2014 22:45
This defines the size of the HTTP connection pool.  This affects
all workers at the moment, but is likely most interesting to the
Monitor as it affects the number of devices the monitor may
concurrently update.

This defaults to 20 (the long-existing, hard-coded value).

In the future, there may be a(n easy) way to specify this on a
per-worker basis, but for now it affects all workers.
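
For reference, a hedged example of what raising the pool size might look like, assuming the option is set in mogilefsd.conf alongside the other tracker options shown earlier (the value 40 is arbitrary; 20 remains the default when the line is absent):

  # mogilefsd.conf -- assumed placement for the new option
  conn_pool_size = 40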
Blindly attempting to write to a socket before a TCP connection can be
established returns EAGAIN on Linux, but not on FreeBSD 8/9.  This
causes Danga::Socket to error out, as it won't attempt to buffer on
anything but EAGAIN on write() attempts.

Now, we buffer writes explicitly after the initial socket creation and
connect(), and only call Danga::Socket::write when we've established
writability.  This works on Linux, too, and avoids an unnecessary
syscall in most cases.

Reported-by: Alex Yakovenko <[email protected]>
Otherwise we'll end up constantly waking up when there's nothing
to write.
The timeout check may run on a socket before epoll_wait/kevent has
a chance to run, giving the application no chance for any readiness
callbacks to fire.

This prevents timeouts in the monitor if the database is slow during
synchronous UPDATE device calls (or there are just thousands of active
connections).
HTTP requests time out because we had to wait synchronously for DBI,
this is very noticeable on a high-latency connection.  So avoid
running synchronous code while asynchronous code (which is subject
to timeouts) is running.
With enough devices and high enough network latency to the DB,
we bump into the watchdog timeout of 30s easily.
Issuing many UPDATE statements slows down monitoring on high latency
connections between the monitor and DB.  Under MySQL, it is possible
to do multiple UPDATEs in a single statement using CASE/WHEN
syntax.

We limit ourselves to 10000 devices per update for now, this should
keep us comfortably under the max_allowed_packet size of most
MySQL deployments (where the default is 1M).

A compatibility function is provided for SQLite and Postgres users.
SQLite users are not expected to run this over high-latency NFS, and
interested Postgres users should submit their own implementation.
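A minimal sketch of the CASE/WHEN batching described above, building one UPDATE for several devices at once. The column names follow the standard MogileFS device table, but the exact SQL and the %observed data are illustrative, and $dbh is assumed to be an open DBI handle:

  # %observed maps devid => mb_used as seen by the monitor (illustrative data)
  my %observed = (1 => 1024, 2 => 2048, 3 => 512);

  my (@when, @ids);
  for my $devid (sort { $a <=> $b } keys %observed) {
      push @when, "WHEN $devid THEN $observed{$devid}";
      push @ids,  $devid;
  }

  # one statement instead of one UPDATE per device
  my $sql = "UPDATE device SET mb_used = CASE devid "
          . join(" ", @when)
          . " END WHERE devid IN (" . join(",", @ids) . ")";
  $dbh->do($sql);   # $dbh: an existing DBI handle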
mark_fidid_unreachable has not been used since MogileFS 2.35
commit 53528c7
("Wipe out old replication code.", r1432)
No longer used since commit ebf8a5a
("Mass nuke unused code and fix most tests") in MogileFS 2.50
"is not unique" => "UNIQUE constraint failed". String matching is lovely.
Changelog diff is:

diff --git a/CHANGES b/CHANGES
index a6b2872..441b328 100644
--- a/CHANGES
+++ b/CHANGES
@@ -1,3 +1,29 @@
+2014-12-15: Release version 2.72
+
+   * Work with DBD::SQLite's latest lock errors (dormando <[email protected]>)
+
+   * remove update_host_property (Eric Wong <[email protected]>)
+
+   * remove users of unreachable_fids table (Eric Wong <[email protected]>)
+
+   * monitor: batch MySQL device table updates (Eric Wong <[email protected]>)
+
+   * monitor: defer DB updates until all HTTP requests are done (Eric Wong <[email protected]>)
+
+   * connection/poolable: defer expiry of timed out connections (Eric Wong <[email protected]>)
+
+   * connection/poolable: disable watch_write before retrying write (Eric Wong <[email protected]>)
+
+   * connection/poolable: do not write before event_write (Eric Wong <[email protected]>)
+
+   * add conn_pool_size configuration option (Eric Wong <[email protected]>)
+
+   * enable TCP keepalives for iostat watcher sockets (Eric Wong <[email protected]>)
+
+   * host: add "readonly" state to override device "alive" state (Eric Wong <[email protected]>)
+
+   * add LICENSE file to distro (dormando <[email protected]>)
+
 2013-08-18: Release version 2.70

    * This release features a very large rewrite to the Monitor worker to run
Due to a bug in the MultipleNetworks replication policy
<[email protected]>, a network split caused an
instance to explode with overreplicated files.  Since every
too_happy pruning increases failcount, it could end up taking days
to clean up a file with far too many replicas.
The readonly host state was not enabled via mogdbsetup and
could not be used although the code supports it, making
the schema version bump to 16 a no-op.

This bumps the schema version to 17.

Add a test using mogadm to ensure the setting is changeable, as the
existing test for this state did not rely on the database.

This was also completely broken with Postgres before, as
Postgres currently offers no way to modify constraints in-place.
Constraints must be dropped and re-added instead.

Note: it seems the upgrade_add_device_* functions in Postgres.pm
are untested as well and never got used.  Perhaps they ought
to be removed entirely since those device columns predate Postgres
support.
Perl buffered IO is only reading 8K at a time (or only 4K on older
versions!) despite us requesting to read in 1MB chunks.  This wastes
syscalls and can affect TCP window scaling when MogileFS is
replicating across long fat networks (LFN).

While we're at it, this fixes a long-standing FIXME item to perform
proper timeouts when reading headers as we're forced to do sysread
instead of line-buffered I/O.

ref: https://rt.perl.org/Public/Bug/Display.html?id=126403
(and confirmed by strace-ing replication workers)
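To make the distinction concrete: buffered read() goes through PerlIO and refills in 8K (or 4K) chunks no matter how much is requested, while sysread() issues a single read(2) for the full length. A rough sketch of the unbuffered pattern combined with select-based timeouts (buffer size and timeout are illustrative; $sock is assumed to be an already-connected socket):

  use IO::Select;

  my $sel = IO::Select->new($sock);
  my $buf = '';
  while (1) {
      # wait up to 10s for readability so a stalled peer cannot hang the worker
      die "read timeout\n" unless $sel->can_read(10);
      my $n = sysread($sock, my $chunk, 1024 * 1024);   # one read(2), up to 1MB
      die "read error: $!\n" unless defined $n;
      last if $n == 0;                                  # EOF
      $buf .= $chunk;
  }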
* bogomips/fix-readonly:
  enable DB upgrade for host readonly state
* bogomips/fsck-recheck:
  fsck: this avoid redundant fsck log entries
* bogomips/fsck-found-order:
  fsck: do not log FOND if note_on_device croaks
* bogomips/prune-too_happy-v3:
  replicate: reduce backoff for too_happy FIDs
* bogomips/resurrect-device:
  reaper: detect resurrection of "dead" devices
Perl 5.18 stable and later (commit a7b39f85d7caac) introduced a
warning for restarting `each` after hash modification.  While we
accounted for this undefined behavior and documented it in the
past, this may still cause maintenance problems in the future
despite our current workarounds being sufficient.

In any case, keeping idle sockets around is cheap with modern
APIs, and conn_pool_size was introduced in 2.72 to avoid
dropping idle connections at all; so _conn_drop_idle may
never be called on a properly configured tracker.

Mailing list references:

<CABJfL5jiAGC+5JzZjuW7R_NXs1DShHPGsKnjzXrPbjWOy2wi3g@mail.gmail.com>
<[email protected]>
On *BSD platforms, the accept()-ed clients inherit the
O_NONBLOCK file flag from the listen socket.

This is not true on Linux, and I noticed sockets blocking on
write() syscalls via strace.  Checking the octal 04000
(O_NONBLOCK) flag in /proc/$PID/fdinfo/$FD for client TCP
sockets confirms O_NONBLOCK was not set.

This also makes us resilient to spurious wakeups causing
event_read to get stuck, as documented in the Linux select(2)
manpage.
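A small sketch of setting the flag explicitly on an accept()-ed client instead of relying on inheritance, assuming $listen_sock is an IO::Socket listener (the low-level fcntl O_NONBLOCK dance would work equally well):

  my $client = $listen_sock->accept or die "accept: $!";
  $client->blocking(0);   # set O_NONBLOCK explicitly; do not rely on the listener's flags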
Make client query processing less aggressive and more fair by
only enqueueing a single worker request at a time.  Pipelined
requests in the read buffer will only be handled after
successful writes, and any incomplete writes will block further
request processing.

Furthermore, add a watchdog for clients we're writing to
expire clients which are not reading our responses.

Danga::Socket allows clients to use an infinite amount of
space for buffering, and it's possible for dead sockets
to go undetected for hours by the OS.

Use a watchdog to kick out any sockets which have made no
forward progress after two minutes.
This avoids the odd case where the first write completes, but
the second one (for 3 bytes: ".\r\n") does not complete, causing
a client to have both read and write watchability enabled
after the previous commit to stop reads when writes do not
complete.

This would not be fatal, but breaks the rule where clients
should only be reading or writing exclusively, never doing
both; as that could lead to pathological memory usage.

This also reduces client wakeups and TCP overhead with
TCP_NODELAY sockets by avoiding a small packet (".\r\n")
after the main response.
Otherwise it'll be possible to pipeline admin (!) commands
and event_read will trigger EOF before all the admin commands
are processed in read_buf.
* client-backpressure:
  client: always disable watch_read after a command
  client: use single write for admin commands
  tracker: client fairness, backpressure, and expiry
  client connection should always be nonblocking
* bogomips/replicate-nobuf:
  replicate: avoid buffered IO on reads
* bogomips/conn-pool-each:
  ConnectionPool: avoid undefined behavior for hash iteration
If DevFID::size_on_disk encounters an unreadable (dead) device
AND there are no HTTP requests pending, we must ensure
Danga::Socket runs the PostLoopCallback to check if the event
loop is complete.  Do that by scheduling another timer to
run immediately.
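Roughly, the immediate-timer trick described above: Danga::Socket's AddTimer with a zero delay schedules a callback for the next pass through the event loop, which is enough to let the PostLoopCallback run even when no HTTP requests are outstanding (sketch only):

  # schedule a no-op for "now" so the event loop turns over and the
  # PostLoopCallback gets a chance to notice we are finished
  Danga::Socket->AddTimer(0, sub {});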
* fsck-timeout:
  fsck: avoid infinite wait on dead devices
Changelog diff is:

diff --git a/CHANGES b/CHANGES
index 441b328..e053851 100644
--- a/CHANGES
+++ b/CHANGES
@@ -1,3 +1,29 @@
+2018-01-18: Release version 2.73
+
+   * fsck: avoid infinite wait on dead devices (Eric Wong <[email protected]>)
+
+   * client: always disable watch_read after a command (Eric Wong <[email protected]>)
+
+   * client: use single write for admin commands (Eric Wong <[email protected]>)
+
+   * tracker: client fairness, backpressure, and expiry (Eric Wong <[email protected]>)
+
+   * client connection should always be nonblocking (Eric Wong <[email protected]>)
+
+   * ConnectionPool: avoid undefined behavior for hash iteration (Eric Wong <[email protected]>)
+
+   * replicate: avoid buffered IO on reads (Eric Wong <[email protected]>)
+
+   * enable DB upgrade for host readonly state (Eric Wong <[email protected]>)
+
+   * replicate: reduce backoff for too_happy FIDs (Eric Wong <[email protected]>)
+
+   * fsck: this avoid redundant fsck log entries (Eric Wong <[email protected]>)
+
+   * fsck: do not log FOND if note_on_device croaks (Eric Wong <[email protected]>)
+
+   * reaper: detect resurrection of "dead" devices (Eric Wong <[email protected]>)
+
 2014-12-15: Release version 2.72

    * Work with DBD::SQLite's latest lock errors (dormando <[email protected]>)