#DBHangOps 12/11/13 -- Backups, Sharding, and more!
Check out the recording below!
Participate in #DBHangOps this Wednesday, December, 11, 2013 at 12:00pm pacific (19:00 GMT), to discuss:
- How do you do them? How often?
- How do you test your backups?
- What backup tools would you suggest?
- How do you deal with sharded database architectures?
- What tools do you use for managing a sharded environment?
- How do you configure your application to talk to MySQL?
- Do you use VIPs, DNS CNAMEs, hard-coded strings in a config file, or something else?
Load Balancers and MySQL
What load balancers have you use with MySQL? Why?
Be sure to check out the #DBHangOps twitter search, the @DBHangOps twitter feed, or this blog post to get a link for the google hangout on Wednesday!
See all of you on Wednesday!
How do you do them?
- LVM Snapshots
- You should typically target doing this on slaves
- Where do you put your snapshots once they're complete?
- Have a dedicated backup slave that maintains more than 1 open snapshot so you can quickly rollback
- Simply mounting the snapshots is costly, but easy for rollback.
- Before taking the snapshots, you need to "FLUSH TABLES"
- Percona Xtrabackup
- requires innobackupex to collect all MyISAM tables cleanly
- Performing incrementals is pretty painless compared to other methods
- Cool features:
- throttling option to reduce IOPS on server
- Streaming backups on the fly
- Recovering individual tables?
- Doing off a physical backup gets interesting. You need to ensure the data dictionary of InnoDB syncs up with what's in the .ibd file
- Chris' blog has good innodb recovery information.
- It's very important to have alerting on your backups!
- Alert if a backup fails
- Alert if a backup is not complete, restorable, etc.
- Long term storage
- Amazon S3
- Internal filer machines
- Zmanda Recovery Manager
- good ol' shell scripts and cron!
How do you deal with sharded architectures
- Horizontally or vertically?
- Vertically is breaking up logical parts of the application into different databases
- Horizontally is split the same functional data across multiple hosts
Horizontal Sharding considerations
- Need some way to map shards to physical servers
- simpler sharding method
- less flexibility for moving shards around
- Rehashing requires a full recopy of your dataset (expensive at scale)
- Have a more dynamic system that simply stores a shard ID for a given sharded item
- Slightly more complex and a little increased latency due to additional lookup
- greater flexibilty for easily moving sharded data around since you can copy the data and update your mapping data store
What do you use?
- Hard-coded in a file that you push to all machines
- DNS CNAMEs
- Setup DNS CNAMEs to reference certain servers, cluters, and roles
- e.g. have a CNAME for "mysql-master" local to each datacenter, but they all reference the master in a single DC
- DNS can be responsible for providing local slave or closest slave to the application
- Changing masters is as easy as pushing a DNS change
- Service discovery-based systems
- SkyDNS can be used for an auto-discovery based DNS system.
- SmartStack allows for auto service discovery and registration