I recently had the pleasure of giving a talk about the information dimension of stochastic processes at the Workshop on Causality and Dynamics in Brain Networks, which was held in conjunction with the Int. Joint Conf. on Neural Networks in Budapest. Our paper on this work is published in the IEEE Transactions on Information Theory and available on arXiv. The slides of my talk can (as always) be accessed by clicking on the image below.

This work on information dimension means a lot to me, and I’d like to let you know how it all started. I first came into contact with Rényi’s information dimension in 2012, when I worked on the information loss in principal components analysis. Back then I was considering only random variables, but in the summer of 2013, when I was investigating the information loss in anti-aliasing filters, I had the feeling that there should be a similar definition for stochastic processes as well. I somehow managed to get the results published at the Int. Zurich Seminar in 2014, even though I have to admit that some of the more interesting results were obtained only with the help of two Assumptions, which would have been theorems had the results on information dimension already existed. I was certain at this point that a proper definition of the information dimension of a stochastic process must be connected with the bandwidth of the process.

After my PhD, I spent some time at TU Munich as a postdoc. My Schrödinger grant from the Austrian Science Fund allowed me to work on information-theoretic reduction of Markov models. In the first year, though, which was funded directly by my boss Prof. Gerhard Kramer, I set out to work on “Two Little (?) Problems” – at least that was the title under which I presented these two problems to my colleagues during our winter retreat in the beautiful village of Stilfs in early 2015. The two problems were the fractal properties of polar codes (which are an entirely different story) and the information dimension of stochastic processes. While my work on polar codes progressed quickly, I was stuck on the other problem. In the meantime, I had found out that researchers had published a possible definition of information dimension for stochastic processes. But I was not able to pursue this further until I met Stefan Moser and Tobi Koch at the Int. Zurich Seminar in 2016 (where I presented my work on polar codes). I had met both of them before and had read some of their work – I just knew that they would be able to help me.

Long story short, Tobi was strongly interested and invited me to stay with him at Universidad Carlos III de Madrid later that year. The stay was amazing: late breakfasts, dinners at 9 pm, dry 40-degree weather, and ice-cold beer at night to cool down. Most importantly, we got a good way towards proving the connection between information dimension – defined differently than previously suggested – and bandwidth, at least for Gaussian processes. Over the following months our results were refined and extended, and now it feels as if there is not much of importance left to do. It seems as if both of my “Two Little (?) Problems” are solved now.

# Entropy of Random Talks

A collection of mathematical curiosities, chosen purely based on my own interest. Be prepared for a little information theory, combinatorics, and probability! (This blog is trackable: 5ZVNC2)

## Monday, July 22, 2019

## Tuesday, June 4, 2019

### Talk at LSIT 2019

On May 30th, I had the great pleasure to give a talk at the 5th London Symposium on Information Theory. The symposium is a revival of a conference series that was started in the 50s and 60s, with notable speakers such as Shannon and Turing. As back then, this year’s LSIT was jointly organized by Imperial College London (Deniz Gündüz) and King’s College London (Osvaldo Simeone). It was a great honor to be one of the invited speakers, and I was happy to talk about the potentials and pitfalls of training neural networks to minimize the information bottleneck functional (joint work with Ali Amjad from TUM). The paper accompanying this work has been accepted for publication in the IEEE Transactions on Pattern Analysis and Machine Intelligence (but you can also find it on arXiv). If you are interested in the talk, as always you can download it by clicking on the image below.

Unfortunately, my stay at this symposium was the shortest I have ever had (and, hopefully, will ever have): I got notice on the morning of my talk that my wife and my son had fallen sick, so I decided to fly back right after my talk to support them as best I could. Apparently, the universe decided at the same time to make my trip back home as complicated as possible: the mobile website of Austrian Airlines claimed that my last name was invalid (whatever that means), I had to run two miles to get my luggage from the hotel, which left me all sweaty, and a fire alarm right in the middle of my talk upended the conference schedule. I still managed to hold my talk – it would not have been possible without the generous help of the organizers and the kind understanding of the entire audience.

#LSIT is back in session after a short break due to fire alarm 🚨 pic.twitter.com/SXSVvaOytr (Deniz Gunduz, @DenizGunduz1, May 30, 2019)

Leaving a conference right after the talk is rude; it does not give your colleagues the opportunity to discuss your ideas offline over coffee (or beer). Even worse, it can be seen as an expression of disinterest in the talks of your colleagues. In my case, leaving the conference so early made me sad in one more way: I had to leave a group of people – information theorists – that I consider my academic family (and many of whom I even consider friends). Only my own family could make me do that – and I know that the attendees of the London Symposium understand. Thanks!

## Monday, April 15, 2019

### Talk at apc|m 2019

I recently attended the 19th European Advanced Process Control and Manufacturing Conference, held this year in the nice city of Villach, Austria. The conference hosts experts in semiconductor manufacturing from both academia and industry.

I had the pleasure to talk about our work on an information-theoretic similarity measure for patterns on analog wafermaps. Analog wafermaps depict electrical measurement values of devices on a wafer, and patterns on these wafermaps may indicate process deviations. Detecting and classifying these patterns, and reacting appropriately, can prevent further such deviations and, consequently, yield loss. Our work, a collaboration between Know-Center and K-AI within the SemI40 project, makes use of a feature extraction pipeline that was recently accepted for publication in the IEEE Transactions on Semiconductor Manufacturing. If you are interested in the slides, just click on the image below.


## Tuesday, November 6, 2018

### Data Science 101: Average Silhouette Coefficient

*Mouse* dataset:

We will next cluster this dataset into three clusters using k-means. Furthermore, we will evaluate both the clustering result from k-means and the groundtruth clustering (namely, one "head" and two "ears") by means of the average silhouette coefficient (ASC):
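For readers who want to try this experiment themselves, here is a minimal sketch in plain Python. It is not the original code or data: the Mouse-like dataset (one large disc as "head", two small discs as "ears"), the radii, sample sizes, seeds, and all function names are assumptions made for illustration. The silhouette of a point is (b - a) / max(a, b), where a is its mean distance to the other points of its own cluster and b is the smallest mean distance to the points of any other cluster; the ASC is the average over all points.

```python
import math
import random


def sample_disc(rng, center, radius, n):
    """Draw n points uniformly from a disc (sqrt trick for the radius)."""
    pts = []
    for _ in range(n):
        r = radius * math.sqrt(rng.random())
        phi = 2 * math.pi * rng.random()
        pts.append((center[0] + r * math.cos(phi), center[1] + r * math.sin(phi)))
    return pts


def make_mouse(seed=0):
    """Hypothetical Mouse-like data: one big head (0), two small ears (1, 2)."""
    rng = random.Random(seed)
    head = sample_disc(rng, (0.0, 0.0), 1.0, 150)
    left = sample_disc(rng, (-1.1, 1.1), 0.4, 40)
    right = sample_disc(rng, (1.1, 1.1), 0.4, 40)
    return head + left + right, [0] * 150 + [1] * 40 + [2] * 40


def average_silhouette(points, labels):
    """Average silhouette coefficient (ASC) with Euclidean distances."""
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the (other) members of every cluster.
        mean_dist = {}
        for c in set(labels):
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        a = mean_dist.pop(labels[i], None)  # cohesion: own cluster
        if a is None or not mean_dist:
            scores.append(0.0)  # convention here: singleton or single cluster
            continue
        b = min(mean_dist.values())  # separation: nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def kmeans(points, k, seed=0, iters=100):
    """Plain Lloyd iterations; initial centers are k random data points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


points, truth = make_mouse()
found = kmeans(points, 3)
print("ASC, groundtruth:", round(average_silhouette(points, truth), 3))
print("ASC, k-means:    ", round(average_silhouette(points, found), 3))
```

Which of the two values comes out larger depends on the geometry of the synthetic data and on the k-means initialization, so the numbers from this sketch need not match the figure from the original experiment.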

What we observe is quite interesting. First of all, it can be seen that k-means fails to detect the groundtruth clustering, even though the clusters are separated. (See also here; it is argued that k-means prefers clusters of similar size, where size is taken in a Euclidean sense and not in the sense of equal number of datapoints.) Second, and more important, the ASC for the "wrong" solution is larger (i.e., better) than that of the groundtruth clustering.

As a second experiment, we projected the Mouse dataset into three-dimensional space and evaluated the ASC for the groundtruth clustering:

As can be seen, the ASC differs from the ASC of the same cluster assignment in two-dimensional space: the ASC depends on the dimension of the dataset.

All this of course makes sense once we recognize that the ASC is distance-dependent. Since distances change when a dataset is projected into some higher-dimensional space, it is not surprising that the ASC changes as well. Furthermore, since k-means is a distance-based clustering technique, it is not surprising that the ASC of a k-means clustering is high. And finally, ASC will be a good indicator of cluster validity if the clusters in the dataset are distance-based (and not, e.g., density-, model-, or graph-based).
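The distance dependence can be illustrated with a small sketch. The entry does not say which map was used to move the data into three dimensions, so the `lift` function below (appending z = scale · (x² + y²)) is purely a hypothetical choice, as are the toy points and all names. For contrast, `pad` appends a constant coordinate, which is an isometric embedding: it leaves all pairwise distances, and therefore the ASC, unchanged.

```python
import math


def average_silhouette(points, labels):
    """Average silhouette coefficient (ASC), Euclidean, any dimension."""
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the (other) members of every cluster.
        mean_dist = {}
        for c in set(labels):
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        a = mean_dist.pop(labels[i], None)  # cohesion: own cluster
        if a is None or not mean_dist:
            scores.append(0.0)  # convention here: singleton or single cluster
            continue
        b = min(mean_dist.values())  # separation: nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def pad(points):
    """Isometric embedding into 3D: append a constant third coordinate."""
    return [(x, y, 0.0) for x, y in points]


def lift(points, scale=0.5):
    """Hypothetical non-isometric map into 3D: z = scale * (x^2 + y^2)."""
    return [(x, y, scale * (x * x + y * y)) for x, y in points]


# Toy data (an assumption, not the Mouse dataset): two small clusters.
points = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
labels = [0, 0, 0, 1, 1, 1]

asc_2d = average_silhouette(points, labels)
asc_pad = average_silhouette(pad(points), labels)    # same distances as in 2D
asc_lift = average_silhouette(lift(points), labels)  # distances are distorted
print(asc_2d, asc_pad, asc_lift)
```

The cluster assignment is identical in all three calls; only the geometry changes. This suggests that it is the change in pairwise distances, and not the number of coordinates as such, that changes the ASC.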

Related to this, in "Understanding of Internal Clustering Validation Measures" it is shown that k-means performs worse than Chameleon (Figure 6) on a very similar dataset (Figure 5); at least using Chameleon, the ASC is maximized by the correct number of clusters. This paper and the short analysis presented in this entry lead to the following questions:

- Based on what cluster assumptions (distance, density, etc.) are different internal validation measures defined?
- Given any internal validation measure, can we find a synthetic dataset for which the groundtruth clustering has a bad value, while an "obviously wrong" clustering has an extremely good value? I.e., can we find pathological examples for which a given internal validation measure fails? (This entry shows that the answer is positive for ASC.)
- Given these pathological examples, can we show that their properties are in contrast with the cluster assumptions inherent to the considered internal validation measure?

Answering these questions will improve our understanding of these internal cluster validity measures and will help us choose the correct validity measure.
