#DBHangOps!

Talk shop and learn about MySQL and occasionally some other stores!

Topics and RecordingsLearn From Previous Discussions!

How to JoinSome simple setup help!

Follow Us On

Twitter

#DBHangOps 10/30/14 -- New in TokuDB, Outbrain's Orchestrator, and more!

Hello everybody!

Join in #DBHangOps this Thursday, October, 30, 2014 at 11:00am pacific (18:00 GMT), to participate in the discussion about:

  • What's new in TokuDB 7.5
  • Outbrain's Orchestrator - how are other people using it?
  • Mixing data and metadata

You can check out the event page at https://plus.google.com/events/c133qbpofubhliavlou8iie4qic on Thursday to participate.

As always, you can still watch the #DBHangOps twitter search, the @DBHangOps twitter feed, or this blog post to get a link for the google hangout on Thursday!

See all of you on Thursday!

You can catch a livestream at:

Show Notes

What's new in TokuDB 7.5

  • What is TokuDB?
  • Indexing 101!
    • B-Trees
      • InnoDB and MyISAM use BTree indexes. It typically looks like a pyramid with "Pivots" and points that can get you to the sorted leaf nodes.
        • This allows you to quickly get to the data you're looking for.
      • Imagine the data in the BTree index being simply Key=>Value pairs.
      • Once you get to the leaf nodes, a binary search will happen, but everything is already sorted so it's a quick scan
      • BTrees were invented in the early 1970s. A good data structure, but we're starting to push it's scale limits
        • When the BTree doesn't wholly fit in memory, you'll start to see "problems".
      • One IO is needed for every insert/update (actually it's one IO for every index on the table)
    • Fractal trees
      • similar to B-Trees. The 2 big differences are:
        • All internal nodes have message buffers
        • Big nodes (4MB vs. ~16KB)
      • You can define the node size at a table level
      • thanks to the message buffers, you can sometimes serve reads before traversing farther down the index. Eventually changes cascade down into the leaf nodes.
      • Updates are effectively a "Delete then Insert". So only inserts/deletes were needed initially
      • New message types allow for greater flexibility too.
        • add_column() message means a new field can be kep in the message buffer before persisting down in the leaf nodes.
    • What about the InnoDB Change buffer?
      • The primary key index doesn't have a buffer
      • Each secondary index only has one buffer (and it can grow pretty large)
      • The bigger the dataset, the slower your insertions will get over time
  • MySQL Replication 101
    • Modes and logging:
      • Statement Based - statement that's run on a master is written verbatim into the replication log
      • Row Based - inserts and deletes simply write a PK and the operation. Updates write a copy of the row before the update and the changed version of the row
      • Mixed - defaults to statement unless MySQL detects an "unsafe" statement (e.g. a non-deterministic UPDATE statement)
    • Read only slaves
      • typically you want to set "read_only=1". This means only replication or users with the SUPER privilege can change data
        • "Read-free Replication" in TokuDB will break if a statement is inserted by a SUPER user. Be warned!
    • Slave lag
      • in MySQL 5.5, replication is single-threaded. The master is concurrent, but the slave will not be.
      • MySQL 5.6 and MariaDB 10.0 have parallel replication methods
      • Read Free Replication in TokuDB benefits from these too!
  • TokuDB Read Free Replication
    • Binlog format must be set to ROW
    • Slave must have the following set:
      • read_only=1
      • tokudb_rpl_unique_checks=0 and/or tokudb_rpl_lookup_rows=0
    • Skipping unique checks (tokudb_rpl_unique_checks)
      • Master's already done the uniqueness check, so the slave can probably skip this
        • Since InnoDB doesn't support change buffering on the primary key, it needs to do this uniqueness check for maintenance
    • Skipping Read/Write/Modify behavior (tokudb_rpl_lookup_rows)
      • If RBR is enabled, master is already providing before/after images of a row
      • Everything that's needed for the change is already in the binary log
      • TokuDB can do a simple message injection
    • Enabling Read Free Replication can help lagging slaves keep up. It also opens up a significant percentage of read capacity.
    • When enabling Read Free Replication , most reads on a slave that would be needed for replication effectively go away.
    • By opening up read capacity, you could potentially co-locate more than one mysqld for different mysql clusters to save on hardware

Outbrain Orchestrator