In our last post, Bradley described how auto increment works in TokuDB. In this post, I explain one of our implementation’s big benefits: the ability to combine better primary keys with clustered primary keys. In working with customers, we have seen the following scenario come up frequently. The user has data that is streamed into the table, [...]

Autoincrement Semantics

Published on 29 July 2009 by bradley in TokuView

In this post I’m going to talk about how TokuDB’s implementation of auto increment works, and contrast it with the behavior of MyISAM and InnoDB. We feel that the TokuDB behavior is easier to understand, is more standards-compliant, and offers higher performance (especially when implemented with Fractal Tree indexes). In TokuDB, each table can have an [...]

On April 9-10 the National Science Foundation hosted the Workshop on the Science of Power Management (SciPM 2009), where I gave an invited talk. Here I give a brief summary of my talk along with a pointer to the slides. The talk describes how MySQL with TokuDB can provide a path to more energy-efficient database [...]

Summary: An alternate approach, offered in response to our original post, provides excellent improvements for smaller databases, but clustered indexes offer better performance as database size increases. (This posting is by Dave.) Jay Pipes suggested an alternate approach to improving MySQL performance of Query 17 on a TPC-H-like database. Add the index (l_partkey, l_quantity) to [...]

A couple of weeks ago, Baron Schwartz wrote an interesting post describing a rule of thumb he sometimes uses to choose the order of columns in an index. In a nutshell, he recommends putting highly selective columns first. This is a very good rule of thumb. I would like to add another rule of thumb: [...]

Improving TPC-H-like queries – Q17

Published on 15 June 2009 by bradley in TokuView

Executive Summary: A query like TPC-H Query 17 can be sped up by large factors by using straight_joins and clustering indexes. (This entry posted by Dave.) In a previous post, we wrote about queries like TPC-H query 2, and the use of straight_join to improve performance. This week, we consider Query 17, described by the [...]

This post is for storage engine developers who may be interested in implementing multiple clustering keys. After we blogged about TokuDB’s multiple clustering indexes feature, Baron Schwartz suggested we contribute the patch to allow other storage engines to implement the feature. We filed a feature request to MySQL to support this, along with a proposed patch. [...]

I recently posted a blog entry on clustering indexes, which are good for speeding up queries. Eric Day brought up the concern that clustering indexes might degrade update performance. This is often true, since any update will require updating the clustering index as well. However, there are some cases in TokuDB for MySQL where the [...]

Long Index Keys

Published on 01 June 2009 by bradley in TokuView

In this post we’ll describe a query that accrued significant performance advantages from using a relatively long index key. (This posting is by Zardosht and Bradley.) We ran across this query recently when interacting with a customer (who gave us permission to post this sanitized version of the story): SELECT name, Count(e2) AS CountOfe2 FROM [...]

Yesterday, I (Zardosht) posted an entry introducing clustering indexes. Here, I elaborate on three differences between a clustering index and a covering index: Clustering indexes make it possible to create indexes that would otherwise run up against the limits on the maximum length and maximum number of columns in a MySQL index. Clustering indexes simplify syntax, making them [...]
