Web Site Update

Published on 10 November 2009 by bradley in TokuView

We just updated our web site and blogs. We hope the update didn’t cause any trouble for people trying to read the blogs or download TokuDB, our MySQL storage engine. In addition to a new look, we now provide pricing as well as easier downloads.

0 comments | Continue Reading

Cache Miss Rate as a function of Cache Size

Published on 12 September 2009 by bradley in TokuView

I saw Mark Callaghan’s post and his graph showing miss rate as a function of cache size for MySQL running InnoDB. He plots miss rate against cache size and compares it to two simple models:

A linear model where the miss rate is (1-C/D)/50, and
An inverse-proportional model where the miss rate is D/(1000C).
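
To make the two models concrete, here is a minimal Python sketch. It assumes C is the cache size and D is the database size, in the same units (this excerpt does not define them), and the function names are purely illustrative rather than anything taken from Mark's post.

```python
def linear_miss_rate(c, d):
    """Linear model: miss rate = (1 - C/D) / 50."""
    return (1 - c / d) / 50


def inverse_miss_rate(c, d):
    """Inverse-proportional model: miss rate = D / (1000 * C)."""
    return d / (1000 * c)


if __name__ == "__main__":
    d = 100.0  # hypothetical database size, arbitrary units
    for c in (10.0, 25.0, 50.0, 100.0):
        print(f"C/D={c / d:.2f}  "
              f"linear={linear_miss_rate(c, d):.4f}  "
              f"inverse={inverse_miss_rate(c, d):.4f}")
```

Under these assumptions the two models differ most visibly at the extremes: when the cache is as large as the database (C = D), the linear model predicts no misses at all, while the inverse-proportional model still predicts a miss rate of 0.001.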

He seemed happy [...]

1 comment | Continue Reading

Sponsoring OpenSQL Camp 2009

Published on 11 September 2009 by bradley in TokuView

We’re supporting the OpenSQL Camp, which will be held in Portland on November 14.
One of my objectives for the camp is to make progress on a universal storage engine API, to make it possible to use the same storage engines in MySQL, PostgreSQL, Ingres, or any other database. I’m also looking forward [...]

2 comments | Continue Reading

Sorting a Terabyte in 197 seconds

Published on 17 August 2009 by bradley in TokuView

I just returned from The 21st ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), held in Calgary, where I gave a talk about my entry to the sorting contest. I sorted 1TB in 197s on a 400-node machine at MIT Lincoln Laboratory, a record which still stands today. [...]

0 comments | Continue Reading

Autoincrement Semantics

Published on 29 July 2009 by bradley in TokuView

In this post I’m going to talk about how TokuDB’s implementation of autoincrement works, and contrast it with the behavior of MyISAM and InnoDB. We feel that the TokuDB behavior is easier to understand, more standards-compliant, and offers higher performance (especially when implemented with Fractal Tree indexes).
In TokuDB, each table can have an [...]

0 comments | Continue Reading

Summary: An alternate approach, offered in response to our original post, provides excellent improvements for smaller databases, but clustered indexes offer better performance as database size increases. (This posting is by Dave.)

Jay Pipes suggested an alternate approach to improving MySQL performance of Query 17 on a TPC-H-like database.

Add the index (l_partkey, l_quantity) [...]

3 comments | Continue Reading

Improving TPC-H-like queries – Q17

Published on 15 June 2009 by bradley in TokuView

Executive Summary: A query like TPC-H Query 17 can be sped up by large factors by using straight_joins and clustering indexes. (This entry posted by Dave.)

In a previous post, we wrote about queries like TPC-H Query 2, and the use of straight_join to improve performance.
This week, we consider Query 17, described by the TPC-H [...]

8 comments | Continue Reading

Long Index Keys

Published on 01 June 2009 by bradley in TokuView

In this post we’ll describe a query that gained significant performance advantages from using a relatively long index key. (This posting is by Zardosht and Bradley.)

We ran across this query recently when interacting with a customer (who gave us permission to post this sanitized version of the story):

SELECT name,
[...]

12 comments | Continue Reading

Yesterday, I (Zardosht) posted an entry introducing clustering indexes. Here, I elaborate on three differences between a clustering index and a covering index:

Clustering indexes let you create indexes that would otherwise run up against the limits on the maximum length and maximum number of columns in a MySQL index.
Clustering indexes simplify syntax, making them easier [...]

4 comments | Continue Reading

In this posting I’ll describe TokuDB’s multiple clustering index feature. (This posting is by Zardosht.)

In general (not just for TokuDB), a clustered index or a clustering index is an index that stores all of the data for the rows. Quoting the MySQL 5.1 reference manual:

Accessing a row through the clustered index [...]

32 comments | Continue Reading