
Local Indices on Partitioned Tables

I recently wrote an article about a simple model for understanding indices in databases. It may help to read that article first.


In a data processing project I worked on, the data is stored in an Oracle database. The data is queried in different ways, depending on the questions to be answered. Even worse, the queries are not known upfront, so sufficiently fast access has to be provided for all kinds of usage scenarios.

During the design phase, some decisions were made based on the information available at the time. As the main view onto the data was time-series based, the decision was made to use partitioned tables with local indices based on the time stamps. In hindsight, that was a mistake, and it was later corrected.

Let’s assume we want to store two years of data and the partition key is based on the year and week. For instance, week 50 in 2017 has the partition key 1750. If we store data for 2016 and 2017, we need about 104 partitions with partition keys 1601, 1602, …, 1652, 1701, 1702, … 1751 and 1752. For the sake of simplicity in the following calculations, we assume we have 100 partitions in total. Each partition has local indices on different fields.
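
To make the setup concrete, here is a minimal sketch of such a table in Oracle syntax; the table and column names are purely illustrative assumptions, not the ones from the actual project:

-- Hypothetical example of a list-partitioned table with a local index.
CREATE TABLE measurements (
  part_key NUMBER(4)    NOT NULL,  -- year/week partition key, e.g. 1750
  ts       TIMESTAMP    NOT NULL,
  name     VARCHAR2(64) NOT NULL,
  val      NUMBER
)
PARTITION BY LIST (part_key) (
  PARTITION p1601 VALUES (1601),
  PARTITION p1602 VALUES (1602)
  -- ... one partition per week, up to p1752
);

-- A local index is split into one index segment per partition.
CREATE INDEX ix_measurements_name ON measurements (name) LOCAL;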

What is wrong with this scenario now? As long as we can access the table directly via the partition key, nothing. We can create selects like the following (a hypothetical example against the table sketched above):
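
-- Hypothetical example: lookup restricted to one partition.
SELECT *
  FROM measurements
 WHERE part_key = 1750
   AND name = 'some_name';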

Oracle knows which partition to look into and which index to use to find the name. But what happens if we want to look up the name over the whole time range? The query could be something like (again a hypothetical example):
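
-- Hypothetical example: the same lookup without restricting the partition key.
SELECT *
  FROM measurements
 WHERE name = 'some_name';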

In this case the query becomes terribly slow. But why?

Let’s calculate the time characteristics in Big O Notation:

Let’s assume we have a total number of N samples in the whole time series. In the trivial, unpartitioned case without any additional index, we have the time characteristic:

t(N) = O(N)

If we put an index onto the table which can be utilized, we get a much better time characteristic of:

t(N) = O(log2(N))

That’s actually why indices are used. The speedup S is:

S(N) = O(N) / O(log2(N)) ~ N / log2(N)

If N is only 1, there is no speedup at all, but for, let’s say, 1000 samples, the speedup is already about 100. That’s great.

But what happens in case of partitioning? Let’s assume we have P partitions in total. If we use the partition key only, without any additional index, Oracle just accesses the one partition and we get a behavior of

t(N) = O(N/P)

with a total speedup of

S(N) = O(N) / O (N/P) ~ P

In our example above, this is a speedup by a factor of 100, resulting in 1/100th of the query time, because we only need to access 1/100th of the data. Not bad at all.

We can calculate the same characteristics with an additional local index. In this case we get:

t(N) = O(log(N/P))

What happens now if we search for a name over the whole time series? Because we need to read over all partitions again, with a local index only, this comes down to:

t(N) = O(P * log(N/P))

The problem is that we have to look into each of the P partitions and can only use the local index there. That means utilizing an index P times. Compared with an index on an unpartitioned table or with a global index, the speedup is much smaller than one. In our example above with 100 partitions, this leads to:

S(N) = O(log(N)) / O(100 * log(N/100)) ~ 1/100

The reason is that the cost grows linearly with the number of partitions, while the performance gain due to the index grows much more slowly.

These calculations show that in cases of random access on partitioned tables, global indices are recommended in order not to slow down the queries.
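
Applied to the hypothetical table sketched above, the fix could look like this:

-- Recreate the name index without the LOCAL clause (hypothetical example).
-- An index created without LOCAL spans all partitions as one global index,
-- so a lookup by name needs a single index probe instead of one per partition.
DROP INDEX ix_measurements_name;
CREATE INDEX ix_measurements_name ON measurements (name);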

Simple Model for Understanding Database Indices

I have had a lot of discussions in the past about databases and indices. The type of database is not important, as the general design of indices is always the same. The following model can be applied to relational databases, document stores, big table databases, columnar stores and also to graph databases.

The following model is used to explain what indices are, why they work so well, and especially why certain restrictions apply to them:

  1. Multiple single value indices do not help efficiently in where clauses with AND conditions.
  2. For combined indices, the where clauses need to contain certain AND conditions for the index to be fully utilized. Missing some of the conditions may lead to the index not being used.
  3. The order of values in combined indices is important when time series or continuous data is involved. It is shown below what the correct order is to get good performance.

I have to discuss these three points over and over again, as most people are not database specialists and the inner workings of databases are not well known. The following simple model shall provide some insight.

Storing Data in General


Storing data itself is normally quite simple: just put the stuff onto disk. I know this is an oversimplification, but let's assume it as a starting point for now. It is also not important whether the data is put onto a local disk, into a storage like Hadoop, or somewhere else.

For the different databases we can assume basic data structures like:

  1. Relational Database: Row = Map<ColumnName, Value>
  2. Big Table Database: ColumnGroup = Map<Key, Value>
  3. Document Store: Document = <any data structure like JSON>
  4. Graph Database: Vertex, Edge = <any structure>

Important for now is that we have different kinds of data structures; it is not important how exactly they look. We just put them onto disk as a simple binary file.

If we do this, only one thing is missing: we can't distinguish the objects. That's why we define a data store with object IDs:

DataStore = Tuple<ID, Object>

We have an object ID which uniquely identifies a certain Object. We can now read the store Object by Object and find the correct one by comparing the IDs.

Primary Index

With the store above, we can store any data and retrieve it via the unique ID. The issue is that, to find the related Object, we need to read the whole store. With a total number of N samples, this leads to a time characteristic in Big O notation of:

t(N) = O(N)

This is a poor characteristic. It can be improved dramatically by adding a primary index, which maps every ID to the storage position of the Object:

PrimaryIndex = Map<ID, DataFileAndPosition>

Do not confuse this primary index with a primary key from a relational database. Here we are talking about a technical index to retrieve data from disk efficiently. The primary key in a relational database is a logical construct on the database content for data organization and modeling.

As soon as we have a primary index in place, the time characteristic changes to:

t(N) = O(log(N))

We assume here that the index is ordered by the IDs and searchable with a binary search.

The speedup S is:

S(N) = O(N) / O(log(N)) ~ N / log(N)

For N = 1,000 the speedup is around 100, and for N = 1,000,000 it is around 50,000.

For a simple key-value store, this would be good enough. But what do we do if we look for a specific Object without knowing its ID? We would still have to search the whole data store.

Single Value Secondary Indices

To find objects with certain attributes, values or properties (the name depends on the type of database), indices are used. For this to work, we need to be able to extract these Values out of the stored Object:

Value = Object(Name)

The Name is, for instance, the column name in a relational database and a property name in a graph database. To find the value, we need an index like:

Index = Map<Value, Set<ID>>
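
In relational terms, such a single value secondary index is simply an ordinary index on one column; a hypothetical example for a table objects with a column name:

-- Hypothetical example of a single value secondary index.
CREATE INDEX ix_objects_name ON objects (name);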

What about the time characteristics? It was

t(N) = O(N)

before. Now we can do an index lookup first, which has a time characteristic like the primary index:

t(N) = O(log(N))

This is pretty good and with the additional time characteristics of the primary index, we get an overall time characteristic of:

t(N) = O(log(N)) + O(log(N)) = 2 * O(log(N)) = O(log(N))

That’s great, isn’t it? Again, we assume we can access the Value with a binary search.

Definition of Cardinality and Selectivity

For a better understanding of the next chapters and for the sake of simplicity, we need to explain two terms: Cardinality and Selectivity.

The cardinality C is the number of unique values. That means C <= N for a single attribute. If the cardinality is known (an index should keep track of it), the number of returned results can be estimated as:

NumberOfResults = N / C

The simple idea is that each unique value occurs equally often. The factor 1 / C can be defined as the selectivity:

Selectivity = 1 / C

Single Value Indices in AND Clauses

With the explanations above in mind, we can explain why multiple single value indices cannot be combined efficiently. Let's assume an SQL query like the following (a hypothetical example with conditions on two columns a and b):
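
-- Hypothetical example: table objects with single value indices on columns a and b.
SELECT *
  FROM objects
 WHERE a = 'value1'
   AND b = 'value2';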

Additionally, we assume that we have single value indices on both columns. What can the database do now?

The first implementation would normally find that both columns have a single value index and use the index with the better selectivity (the one that filters the most and returns the fewest results). It takes the IDs out of it, reads the Objects and filters the results again with the second condition. In this case we have the same behavior as with one index only.

A second implementation could check both available indices and take the ID sets out of both. The intersection of both sets is calculated, and the resulting Objects are read. Depending on the number of affected IDs and the overall number of Objects, this might be less efficient than the first approach.

The only real solution is a combined index.

Combined Secondary Indices

A combined index is just an extension of a single value index with multiple values:

CombinedIndex = Map<Attribute1Value, Map<Attribute2Value, … Map<AttributeNValue, Set<ID>>…>>

Combined Indices and Where Clause

With the definition of the combined index above, we see why it is important to add the correct where clause to a query. Let us define a combined index over three attributes:

CombinedIndex = Map<Attribute1Value, Map<Attribute2Value, Map<Attribute3Value, Set<ID>>>>
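
In a relational database, this corresponds to one index over the three columns in exactly this order; again a hypothetical example on the objects table:

-- Hypothetical example of a combined index over three columns.
CREATE INDEX ix_objects_combined ON objects (attribute1, attribute2, attribute3);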

As long as the AND clauses in the query cover the attributes from attribute1 to attribute3 (the order within the query statement itself should not matter), the index can be fully utilized. For instance, a hypothetical query like
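
-- Hypothetical example using all three attributes of the combined index.
SELECT *
  FROM objects
 WHERE attribute1 = 'a'
   AND attribute2 = 'b'
   AND attribute3 = 'c';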

is optimal. It runs the query directly on the index and finds all relevant rows.

The same holds true for the queries:
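
-- Hypothetical examples using only a prefix of the combined index.
SELECT * FROM objects WHERE attribute1 = 'a' AND attribute2 = 'b';

SELECT * FROM objects WHERE attribute1 = 'a';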

The only difference is that only the first two maps or only the first map can be used to narrow down the set. The remaining maps are scanned. All relevant rows are still found with the index, and the Objects can be read.

But, what about the following queries?
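
-- Hypothetical examples that skip the leading attribute of the combined index.
SELECT * FROM objects WHERE attribute2 = 'b' AND attribute3 = 'c';

SELECT * FROM objects WHERE attribute3 = 'c';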

The index cannot be used efficiently, because the first map needs to be scanned completely. Some benefit may still be gained from the other maps, but it is not guaranteed that the database can do that.

That’s why it is important to put some thought into the correct definition of combined indices. If the last queries are dominant, the order of the attributes should be changed, or an additional index might make sense. It heavily depends on the expected use cases.

Time Series and Continuous Data

For time series data and continuous data, a closer look is needed. For this kind of data, a check for equality makes sense only in rare cases, but creating an index makes sense anyway.

Continuous data (including time series) is queried with inequalities to get defined ranges of the data. If an index was created, a range scan can take place: with the same binary search algorithm, the start of the range is found, and afterwards all values up to the end of the range are used to retrieve the IDs. We assume here again that the Values are ordered in the index, so a binary search can take place.

Continuous Data in Combined Secondary Indices

Let’s assume we have a table with two attributes Time and Name. What is the best order for the combined index? The query could look like the following (a hypothetical example, with event_time standing in for the Time attribute):
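
-- Hypothetical example: table samples with columns name and event_time.
SELECT *
  FROM samples
 WHERE name = 'some_name'
   AND event_time >= TIMESTAMP '2017-01-01 00:00:00'
   AND event_time <  TIMESTAMP '2017-03-15 00:00:00';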

For this example, let's assume the selectivity of the name is 10% and the queried time range also covers 10% of the data.

If we create the index in the order

Index = Map<NameValue, Map<TimeValue, Set<ID>>>

the database can already narrow the search with the name down to 10% of the index and can then perform a range scan over the second map. This results in reading only 1% of the index.

If we create the index in the other order

Index = Map<TimeValue, Map<NameValue, Set<ID>>>

the database can perform a range scan over the first map. But because time is a continuous value, there is most likely only one entry per time value, so the name needs to be checked again for every time value in the range. This leads to a less efficient retrieval of data and a read of 10% of the index.

If continuous data is to be put into a combined index, it is to be expected that the index is most efficient when the continuous attribute is put in the last position.
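
Expressed as hypothetical index definitions for the samples table above, the two orders look like this; following the reasoning above, the first one is the order to prefer:

-- Name first, the continuous time attribute last: narrow by name, then range-scan.
CREATE INDEX ix_samples_name_time ON samples (name, event_time);

-- Time first: the range scan has to cover the whole leading part of the index.
CREATE INDEX ix_samples_time_name ON samples (event_time, name);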

Short Summary

The explanations above are maybe not the most accurate, but for me they have served quite well to understand some effects I experienced.

Another article about partitioned tables is in preparation; there, some additional effects are considered.

eXtended Objects: ORM for No-SQL Databases in JavaEE

Since the appearance of object-relational mappers (ORMs), the programming of persistence layers has become easier and less error prone. ORMs store objects in SQL databases by interpreting the object's fields as columns of a table. With the standardization of the Java Persistence API, the success was inevitable, and a lot of Java EE and also Java SE projects rely on ORM.

For about ten years now, a new class (or rather, new classes) of databases has been appearing, developed in large companies like Amazon, Google and Facebook to solve storage issues which cannot be handled efficiently by SQL databases. These databases are divided into the classes Big Table databases, graph databases, key-value databases and document databases.

This article is not about the databases themselves, but about another issue: ORM is made for SQL databases and does not work very well with No-SQL databases. Additionally, the ORM implementations for Java are, as far as I know, all JDBC based, which is also not very handy when it comes to migrating to databases that do not have a JDBC-like driver implementation. A new approach needs to be thought of, and some shortcomings of ORM need to be overcome.

A new approach comes with a project called eXtended Objects. Version 0.4.0 of eXtended Objects was released on GitHub: https://github.com/buschmais/extended-objects. Even though it is still an early version, it is a quite stable release, the first implementations for databases are out, and it is worth mentioning.

eXtended Objects in JavaEE

For a PureSol Technologies project, the Titan database (http://thinkaurelius.github.io/titan) is used, and therefore a Titan storage back end (https://github.com/PureSolTechnologies/extended-objects-titan) was developed for eXtended Objects to make it work with Titan. The current release is version 0.1.1, which is still based on eXtended Objects 0.3.0, but a new release for version 0.4.0 may follow quickly.

Since the release is hosted in Maven Central, a simple Maven dependency declaration is sufficient to get eXtended Objects for Titan.

eXtended Objects is well suited for Java SE, but in Java EE the current approach to getting the mapper working is not suitable; there, we want to use @Inject to get a new xoManager. This can be done via a producer method (and a disposer method to close the xoManager). This may be the topic of another blog post in the future.

Supported Databases

As already described in the features section, several data store implementations are already out, in development or planned. One of the first implementations was done for a graph database to check the eXtended Objects SPI and to prove the concept. Due to the current usage, this first implementation was done for the Titan graph database and is provided by PureSol Technologies.

Other implementations are planned or already available in a development state. Here is the current list of (known) projects:

  1. eXtended Objects for Neo4j by Buschmais GbR (stable): https://github.com/buschmais/extended-objects (included in eXtended Objects base library)
  2. eXtended Objects for JSON by Buschmais GbR (development): https://github.com/buschmais/extended-objects (included in eXtended Objects base library)
  3. eXtended Objects for Titan by PureSol Technologies (development): https://github.com/PureSolTechnologies/extended-objects-titan (project page: http://opensource.puresol-technologies.com/xo-titan)
  4. eXtended Objects for Blueprints (development): https://github.com/BluWings/xo-tinkerpop-blueprints