What do I mean by data transformation?
Whenever I try to develop a mathematical model for a certain phenomenon, I experiment with functions that manipulate input data in such a way as to properly map onto my expected output. Thus, by data transformation I mean a function that maps from the input data format to some data format and changes the value in some meaningful way. To make it blatantly clear, here is a trivial example:
f(x) = 2 * x
The important thing about these transformations is not that they are mathematically simple but how they behave for sets of input data points. Does a given transformation
- ... preserve the data point order?
- ... reverse the data point order?
- ... tamper with the order in more complex ways?
- ... change the distances between data points?
- ... change the relative distances between data points?
- ... highlight specific parts of the data set?
In this post I look at the following transformations:
- Constant transformation
- Identity transformation
- Offset transformation
- Linear transformation
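The questions about order and distances can be checked mechanically. Here is a minimal sketch in Python, applied to the trivial example f(x) = 2 * x. The data set `data` and the helper names are made up for illustration, and the order checks assume all input values are distinct.

```python
def preserves_order(f, xs):
    """True if f maps increasing inputs to strictly increasing outputs."""
    ys = [f(x) for x in sorted(xs)]
    return all(a < b for a, b in zip(ys, ys[1:]))


def reverses_order(f, xs):
    """True if f maps increasing inputs to strictly decreasing outputs."""
    ys = [f(x) for x in sorted(xs)]
    return all(a > b for a, b in zip(ys, ys[1:]))


def differences(xs):
    """Differences between neighbouring data points."""
    return [b - a for a, b in zip(xs, xs[1:])]


def double(x):
    return 2 * x


data = [1, 2, 4, 7]  # hypothetical data set, not the one from the post
doubled = [double(x) for x in data]

print(preserves_order(double, data))  # True
print(reverses_order(double, data))   # False
print(differences(data))              # [1, 2, 3]
print(differences(doubled))           # [2, 4, 6]
```

The order survives, but every distance between neighbouring points is doubled.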
Constant transformation
This one is so simple that, often, I overlook it.
f(x) = c
This is the great equalizer. It levels the playing field and makes all data points the same. Here is an example for c = 4 applied to the example data set.
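As a small sketch of the same idea in Python, with c = 4: the list `data` is a hypothetical stand-in for the example data set, and `constant` is a helper name I made up.

```python
def constant(c):
    """Return a transformation that ignores its input and always yields c."""
    return lambda x: c


data = [1, 2, 4, 7]  # hypothetical stand-in for the example data set
flattened = [constant(4)(x) for x in data]

print(flattened)  # [4, 4, 4, 4]
```

Whatever the input looked like, order and distances are gone afterwards.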
Identity transformation
Another very simple transformation is:
f(x) = x
Judging by its interestingness, I'd say it is even below the constant transformation. But again, it should be ever present in my mind for being a most interesting special case of more complex transformations.
Let's make the obvious statement that, since the output is identical to the input, it preserves order and differences and move on.
Offset transformation
Still nothing too fancy here. This transformation just shifts the input data set by a constant value.
f(x) = x + c
It preserves the order and the absolute differences, although it changes the relative differences, and is, overall, rather boring.
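A minimal Python sketch of this behaviour, with c = 3. Two things here are my assumptions: the data set is made up, and "relative difference" is taken to mean the difference between two neighbouring points divided by the first of them.

```python
def offset(c):
    """Return the transformation f(x) = x + c."""
    return lambda x: x + c


def abs_diffs(xs):
    """Absolute differences between neighbouring data points."""
    return [b - a for a, b in zip(xs, xs[1:])]


def rel_diffs(xs):
    """Assumed definition: difference divided by the earlier value."""
    return [(b - a) / a for a, b in zip(xs, xs[1:])]


data = [1, 2, 4, 7]  # hypothetical data set
shifted = [offset(3)(x) for x in data]

print(shifted)                                   # [4, 5, 7, 10]
print(abs_diffs(data) == abs_diffs(shifted))     # True
print(rel_diffs(data) == rel_diffs(shifted))     # False
```

The shift leaves the gaps between points untouched, but each gap is now measured against a larger base value, so the relative differences shrink.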
Linear transformation
To anyone with at least a tangential grasp of mathematics, it is no surprise: the identity transformation is only a special case of the linear transformation.
f(x) = m * x
Let's split some hairs here: Of course I could add a "+ c" at the end of that definition and be mathematically more correct. But for me it is more important to keep the principle concise.
Each of these transformations has a specific task to perform on the input data. Of course, mathematically, the constant, identity and offset transformations are just special cases of the general linear transformation, but then, the linear transformation is just a special case of other, more complex transformations. In the end, I could only talk about the most abstract and complex transformations that subsume all other cases, at the risk of having lost everybody not willing or able to follow me to abstract fairy land. I'll stick to calling them individually and remain content that I can combine them any way I want, to reach more complex forms.
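To illustrate combining the individual transformations, here is a rough sketch in Python. The `compose` helper and the example values are my own choices for illustration. It chains transformations left to right, so a linear step followed by an offset step gives f(x) = m * x + c.

```python
from functools import reduce


def linear(m):
    """Return the transformation f(x) = m * x."""
    return lambda x: m * x


def offset(c):
    """Return the transformation f(x) = x + c."""
    return lambda x: x + c


def compose(*fs):
    """Chain transformations, applying them from left to right."""
    return lambda x: reduce(lambda acc, f: f(acc), fs, x)


data = [1, 2, 4, 7]  # hypothetical data set

affine = compose(linear(2), offset(3))         # f(x) = 2 * x + 3
constant_like = compose(linear(0), offset(4))  # behaves like f(x) = 4
identity_like = compose(linear(1), offset(0))  # behaves like f(x) = x

print([affine(x) for x in data])         # [5, 7, 11, 17]
print([constant_like(x) for x in data])  # [4, 4, 4, 4]
print([identity_like(x) for x in data])  # [1, 2, 4, 7]
```

The last two lines show the special cases falling out of the combined form: m = 0 gives the constant transformation, and m = 1 with c = 0 gives the identity.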
A simple example for a linear transformation with m = 0.4.
Obviously, it preserves the order and the relative differences, but it modifies the absolute differences by multiplying them by m.
It gets slightly more interesting with a negative value for m which reverses the order. I set m = -0.7 in this example.
Not surprisingly, as with a positive m, this changes all absolute differences and, because m is negative, flips their sign. Relative differences are preserved, though.
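Both cases, m = 0.4 and m = -0.7, can be checked with a short Python sketch. As assumptions on my part, the data set is made up, and "relative difference" means the difference between two neighbouring points divided by the first of them. Floating-point results are compared with `math.isclose` rather than `==`.

```python
import math


def linear(m):
    """Return the transformation f(x) = m * x."""
    return lambda x: m * x


def abs_diffs(xs):
    """Absolute differences between neighbouring data points."""
    return [b - a for a, b in zip(xs, xs[1:])]


def rel_diffs(xs):
    """Assumed definition: difference divided by the earlier value."""
    return [(b - a) / a for a, b in zip(xs, xs[1:])]


def all_close(xs, ys):
    """Element-wise comparison that tolerates floating-point rounding."""
    return len(xs) == len(ys) and all(math.isclose(x, y) for x, y in zip(xs, ys))


data = [1, 2, 4, 7]  # hypothetical, strictly increasing data set

for m in (0.4, -0.7):
    out = [linear(m)(x) for x in data]
    increasing = all(a < b for a, b in zip(out, out[1:]))
    scaled = all_close(abs_diffs(out), [m * d for d in abs_diffs(data)])
    kept = all_close(rel_diffs(out), rel_diffs(data))
    print(f"m = {m}: order {'preserved' if increasing else 'reversed'}")
    print(f"  absolute differences multiplied by m: {scaled}")
    print(f"  relative differences preserved: {kept}")
```

Under the assumed definition, the factor m cancels in each relative difference, which is why they survive even the sign flip.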