Friday 22 March 2013

Non-linear Data Transformation

Linear Data Transformation

My last two posts on Data Transformations and Normalization Transformations were concerned with transformations that can be grouped together under the common title: linear data transformations. Such transformations have many use cases and are quite important. More often, though, I find myself searching for ways to skew the input data set to either
  1. favour the high end of the data or
  2. favour the low end of the data,
but without changing the order of the data points.
Here is an example, with some explanations, for which I use the sample data from my Data Transformations post (all integers from 0 to 10, inclusive), but with a normalization applied. This results in the following data set: 0, 0.1, 0.2, and so on up to 1.0.

Intention

What do I mean by "favouring the high end data"? Of course, in absolute terms, the high end data is already favoured as much as possible. The maximum value of 1 could not possibly be increased while still retaining the normalization range. Instead, I really want to decrease the low end data, thus highlighting the high end data more prominently in terms of relative distances. Of course, decreasing data points below 0 is not an option either, as that would lead out of the normalization range.
I needed some other, non-linear transformation to achieve this output data:
Conversely, what do I mean by "favouring the low end data"? Just like above, I mean to increase the relative distances of data points, but this time in the low end spectrum of the data set. This is what the output data should look like:

Properties

As you can see from the two pictures above, the order remains untouched by the transformation. The absolute distances are of course changed, and the relative distances are changed as well. Since the relative distances are not changed in a proportional way, the transformation is non-linear. When I set out looking for a data transformation that does this, I wanted to find a function with a parameter that modifies the intensity of skewing the data towards either the low end or the high end. It was also important to me to specify the corner cases:
  1. There is a maximum intensity that skews the data infinitely towards the high end. Looking at the example picture above and taking the idea to an extreme, this means that at maximum intensity, the transformation turns into a constant transformation with the constant 0.
  2. There is a minimum intensity that skews the data infinitely towards the low end. Analogous to the maximum intensity extreme, the minimum intensity extreme turns the transformation into a constant transformation with the constant 1.
  3. There is a well-specified "normal" intensity which does not change the input data, thus turning the transformation into an identity transformation.
Here is a picture of what that should look like:
The range of the intensity parameter is 0 to 1 and the specific intensities for the transformations above are 0, 1/20, 1/4, 1/2, 3/4, 19/20 and 1.

Mathematica

The following function does almost all of this:
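The formula itself appears as an image in the original post; a function with exactly the properties listed above, built from the slider mapping in the post below, would be:
f(x, i) = x^(i / (1 - i))
The exponent i / (1 - i) equals 1 at i = 1/2 (identity), approaches 0 as i approaches 0 (constant 1), and grows without bound as i approaches 1 (constant 0). The intensities 1/20, 1/4, 3/4 and 19/20 from above translate to the exponents 1/19, 1/3, 3 and 19. Take this as a plausible reconstruction rather than the verbatim original.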
The parameter i is the intensity and x is the input data. Sadly, this function has two undefined points at:
  1. x = 0, i = 0 
  2. x = 1, i = 1.
These are the two points where either the low end data point has to jump up to 1 in order to satisfy the minimum intensity, or the high end data point has to jump down to 0 to satisfy the maximum intensity. Luckily, the conditions are easy to catch and you can just return the defined result. In Mathematica I use this definition:
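The original definition is shown as an image; a sketch consistent with the description, assuming the candidate formula above (the name skewTransform is mine), looks like this:

skewTransform[x_, i_] /; x == 0 && i == 0 := 1  (* minimum intensity: the low end jumps up to 1 *)
skewTransform[x_, i_] /; i == 1 := 0            (* maximum intensity: everything collapses to 0 *)
skewTransform[x_, i_] := x^(i/(1 - i))

Since the exponent i/(1 - i) is itself singular at i = 1, the second rule catches the whole maximum intensity case, not just the single point x = 1. As a quick check, skewTransform[0.5, 3/4] evaluates to 0.5^3 = 0.125 and skewTransform[0.5, 1/4] to 0.5^(1/3) ≈ 0.794.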

Saturday 16 March 2013

Normalization Transformations

This is a small follow-up on my post about Data Transformations, adding two easy transformations regarding data normalization. To highlight the main point of these two transformations, I'll consider this data set, containing twelve numbers:

Normalization

Many models are much easier to design if the input data is provided in the range of 0 to 1. For brevity, I will assume that all data points are positive. Data normalization is performed by multiplying every data point by the multiplicative inverse of the maximum data point.
f(x) = x / max
Not surprisingly, the order remains the same, as do the relative distances. The absolute differences are scaled by the same factor as the data points themselves. This follows from the fact that it is basically a linear data transformation with a very specific multiplier.
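As a minimal Mathematica sketch (the function name is mine), exploiting the fact that arithmetic operations automatically map over lists:

normalize[data_List] := data/Max[data]

For example, normalize[{50, 60, 80, 100}] returns {1/2, 3/5, 4/5, 1}.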

Range Normalization

If I want my mathematical model to place special focus on the differences between the data points, there is some more room for improvement. In the data set, every data point is greater than or equal to 50, which means that 50% of the normalization range 0 to 1 is wasted on a region that contains no data points at all.
Conceptually, I perform an offset transformation with the negative minimum and a normalization with the new maximum. Practically, it is easier to calculate minimum and maximum from the original data in one sweep and use the combined range normalization transformation:
f(x) = (x - min) / (max - min)
This transformation preserves the order of the data points but modifies all relative and absolute distances. Most importantly, the output data is guaranteed to contain the data points 0 and 1, which means that the data is spread as well as possible within the range of 0 to 1.
Interestingly, this normalization technique is also appropriate for normalizing data sets containing negative data points into the range of 0 to 1.
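A corresponding Mathematica sketch (again, the name is mine):

rangeNormalize[data_List] := (data - Min[data])/(Max[data] - Min[data])

For example, rangeNormalize[{-4, 0, 6}] returns {0, 2/5, 1}, which also illustrates the point about negative data points.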

Wednesday 13 March 2013

Data Transformations

I started writing a post about my most recent work on mathematical models but couldn't quite get started because I felt that wherever I stepped, I had to explain something in more detail or define some words that I wanted to use in order to make a succinct statement. There are some preliminaries to be covered before my work makes any sense. Unfortunately, this also means that this post is rather dull. Have a good flight over it ... there should be no surprises here.

What do I mean by data transformation?

Whenever I try to develop a mathematical model for a certain phenomenon, I experiment with functions that manipulate input data in such a way as to properly map onto my expected output result. Thus, by data transformation I mean a function that maps from the input data format to some output format and changes the values in some meaningful way. To make it blatantly clear, here is a trivial example:
f(x) = 2 * x
The important thing about these transformations is not that they are mathematically simple but how they behave for sets of input data points. Does a given transformation
  • ... preserve the data point order?
  • ... reverse the data point order?
  • ... tamper with the order in more complex ways?
  • ... change the distances between data points?
  • ... change the relative distances between data points?
  • ... highlight specific parts of the data set?
There are many ways to transform data and in this post I want to cover four very easy and fundamental ones:
  1. Constant transformation
  2. Identity transformation
  3. Offset transformation
  4. Linear transformation
To investigate these functions I use this simple data set. It contains all integers from 0 to 10, inclusive.
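In Mathematica terms (one way of writing it, as a sketch), the data set and the application of a transformation f to every data point look like this:

data = Range[0, 10]  (* {0, 1, 2, ..., 10} *)
f /@ data            (* map f over all data points *)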

Constant transformation

This one is so simple that, often, I overlook it.
f(x) = c
This is the great equalizer. It levels the playing field and makes all data points the same. Here is an example for c = 4 applied to the example data set.
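As a Mathematica sketch (the name constant is mine):

constant[c_][x_] := c
constant[4] /@ data  (* {4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4} *)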


In itself it is not an important transformation and has no other surprising properties, except being an equalizer. But it is worth being aware of this transformation as an extreme case of other, more complex transformations, as well as for its part in combined transformations.

Identity transformation

Another very simple transformation is:
f(x) = x
Judging by its interestingness, I'd say it is even below the constant transformation. But again, it should be ever present in my mind for being a most interesting special case of more complex transformations.
Let's make the obvious statement that, since the output is identical to the input, it preserves order and differences, and move on.

Offset transformation

Still nothing too fancy here. This transformation just shifts the input data set by a constant value.
f(x) = x + c
It preserves the order and the absolute differences, although it changes the relative differences, and is, overall, rather boring.
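A Mathematica sketch (the name offset is mine) makes the preserved absolute differences easy to verify:

offset[c_][x_] := x + c
offset[5] /@ data                                     (* {5, 6, 7, ..., 15} *)
Differences[offset[5] /@ data] == Differences[data]   (* True *)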

Linear transformation

To anyone with at least a tangential grasp of mathematics, it is no surprise: the identity transformation is only a special case of the linear transformation.
f(x) = m * x
Let's split some hairs here: of course I could add a "+ c" at the end of that definition and be mathematically more complete. But for me it is more important to keep each transformation conceptually concise.
Each of these transformations has a specific task to perform on the input data. Of course, mathematically, the constant, identity and offset transformations are just special cases of the general linear transformation, but then, the linear transformation is just a special case of other, more complex transformations. In the end, I could only talk about the most abstract and complex transformations that subsume all other cases, at the risk of having lost everybody not willing or able to follow me to abstract fairy land. I'll stick to naming them individually and remain content that I can combine them any way I want to reach more complex forms.
A simple example of a linear transformation with m = 0.4:
Obviously, it preserves the order and the relative differences, but it modifies the absolute differences by multiplying them by m as well.
It gets slightly more interesting with a negative value for m, which reverses the order. I set m = -0.7 in this example.
 
Not surprisingly, as with a positive m, this changes all absolute differences and, because m is negative, flips their sign. Relative differences are preserved, though.
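As a Mathematica sketch (the name linear is mine), both examples look like this:

linear[m_][x_] := m x
linear[0.4] /@ data   (* {0., 0.4, 0.8, ..., 4.} *)
linear[-0.7] /@ data  (* {0., -0.7, -1.4, ..., -7.}; the order is reversed *)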

Friday 8 March 2013

Getting 0 To ∞ On A Linear Slider

Problem

I use Mathematica for all my attempts at developing mathematical models for all kinds of problems. The beautiful thing about Mathematica is its ability to provide custom user interface boxes that allow fiddling around with the parameters in those models.


This shows a generated user interface with a slider for changing the value of "m", which is the slope in the depicted function. But the slider has a maximum value (100 in this case), so you could never conveniently explore the whole range of values for "m". Every concrete value for "m", no matter how high, would always be just at the beginning of the slider.
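A minimal example of such an interface (the plotted function is my stand-in for the one in the screenshot):

Manipulate[Plot[m x, {x, 0, 10}], {m, 0, 100}]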

Solution

What I needed was a function that would map the interval 0 to ∞ onto a linear scale. The actual linear scale doesn't matter, but the most versatile one is, of course, a normalized scale from 0 to 1. One function that does this is:
f(x) = x / (1 - x)
and it looks like this.
The function has these properties:
  1. at x = 0, it returns 0
  2. at x = 0.5, it returns 1
  3. for x approaching 1 it approaches infinity
  4. at x = 1 it is undefined

Mathematica

In Mathematica I use the following definition, which makes it easier to use in the context of parameter boxes as mentioned above.
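The definition appears as an image in the original post; one that matches the described behaviour (the name slider is mine) could look like this:

slider[x_] /; x == 1 := Infinity  (* at the right end of the slider, interpret the undefined value as ∞ *)
slider[x_] := x/(1 - x)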
This avoids the error with an undefined result at x = 1 and, in this context, is the correct interpretation of the undefined value.