Summary: An alternate approach, offered in response to our original post, provides excellent improvements for smaller databases, but clustered indexes offer better performance as database size increases. (This posting is by Dave.)

Jay Pipes suggested an alternate approach to improving MySQL performance of Query 17 on a TPC-H-like database.

Add the index (l_partkey, l_quantity) [...]

3 comments | Continue Reading

A couple of weeks ago, Baron Schwartz wrote an interesting post describing a rule of thumb he sometimes uses to choose the order of columns in an index. In a nutshell, he recommends putting highly selective columns first. This is a very good rule of thumb.

I would like to add another rule of thumb: columns [...]

1 comment | Continue Reading

Improving TPC-H-like queries – Q17

Published on 15 June 2009 by bradley in TokuView

Executive Summary: A query like TPC-H Query 17 can be sped up by large factors by using straight_joins and clustering indexes. (This entry posted by Dave.)

In a previous post, we wrote about queries like TPC-H query 2, and the use of straight_join to improve performance.
This week, we consider Query 17, described by the TPC-H [...]

8 comments | Continue Reading

This post is for storage engine developers that may be interested in implementing multiple clustering keys.

After blogging about TokuDB’s multiple clustering indexes feature, Baron Schwartz suggested we contribute the patch to allow other storage engine to implement the feature. We filed a feature request to MySQL to support this, along with a proposed patch. The [...]

0 comments | Continue Reading

I recently posted a blog entry on clustering indexes, which are good for speeding up queries. Eric Day brought up the concern that clustering indexes might degrade update performance. This is often true, since any update will require updating the clustering index as well.
However, there are some cases in TokuDB for MySQL, where the [...]

0 comments | Continue Reading

Long Index Keys

Published on 01 June 2009 by bradley in TokuView

In this post we’ll describe a query that accrued significant performance advantages from using a relatively long index key. (This posting is by Zardosht and Bradley.)

We ran across this query recently when interacting with a customer (who gave us permission to post this sanitized version of the story):

SELECT name,
[...]

12 comments | Continue Reading

Yesterday, I (Zardosht) posted an entry introducing clustering indexes. Here, I elaborate on three differences between a clustering index and a covering index:

Clustering indexes can create indexes that would otherwise bounce up against the limits on the maximum length and maximum number of columns in a MySQL index.
Clustering indexes simplify syntax making them easier [...]

4 comments | Continue Reading

In this posting I’ll describe TokuDB’s multiple clustering index feature. (This posting is by Zardosht.)

In general (not just for TokuDB) a clustered index or a clustering index is an index that stores the all of the data for the rows. Quoting the MySQL 5.1 reference manual:

Accessing a row through the clustered index [...]

32 comments | Continue Reading

The TokuDB storage engine for MySQL employs Fractal Tree technology. We’ve been planning to write a white paper explaining how fractal tree indexing works, but haven’t gotten to it yet. In the mean time, here are links to some academic papers that relate to our technology.

Cache-Oblivious B-Trees by Michael A. Bender, Erik D. [...]

5 comments | Continue Reading

The talk I gave at the Percona Performance Conference at the MySQL
Users Conference in April 2009 can be found
at http://tokutek.com/presentations/kuszmaul-mysqluc-percona-09-slides.pdf.

This talk provides some examples where covering indexes help, and
then describes a performance model that can be used to understand and
predict query performance. It covers clustering indexes (which are a
kind of “universal” covering index), and [...]

0 comments | Continue Reading