The tools below are used to classify Mathematical Theorems from the Wikipedia page List of Theorems. Access the full Python Notebook here, or an executive summary here.
Below are the PowerPoint slides that describe the MathNet classifier, and they can be downloaded here.
TeX Trimmer
A LaTeX trimmer was needed to convert the Math theorems into plain text for model training and analysis. The following is the Wikipedia page for Wold's Theorem from Statistics. The trimmer works by removing TeX formatting commands, along with the brackets used to format a TeX document. Clicking submit changes this page into the trimmed version of the document; reset changes it back.
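The trimming step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual trimmer: the function name `trim_tex` and the specific regular expressions are assumptions, and a real Wikipedia page would need more rules than these.

```python
import re


def trim_tex(text: str) -> str:
    """Strip common TeX markup and keep the readable words (hypothetical helper)."""
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", text)
    # Drop environment markers such as \begin{align} and \end{align}.
    text = re.sub(r"\\(?:begin|end)\{[^{}]*\}", " ", text)
    # Drop control sequences such as \textbf or \\ but keep their arguments.
    text = re.sub(r"\\(?:[A-Za-z]+\*?|.)", " ", text)
    # Drop the brackets and delimiters used for formatting.
    text = re.sub(r"[{}\[\]$&]", " ", text)
    # Collapse the leftover runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()


sample = r"\textbf{Theorem.} Every \emph{stationary} process $Y_t$ has a representation."
print(trim_tex(sample))
```

The order matters in this sketch: environment markers are removed before generic commands so that the environment name (for example `align`) is not left behind as a stray word.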
Common Words per Field
Each field of mathematics was characterized by the most common words across all of its related Theorems. To see the most common words in both word cloud and bar graph form, use the objects below: select a field from the drop-down box and then click submit to populate the graphs.
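A word count per field like the one behind these graphs could be computed along the following lines. This is only a sketch under stated assumptions: the function name `top_words_per_field`, the tiny stop-word list, and the sample sentences are all hypothetical and are not taken from the notebook.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the notebook's list).
STOP_WORDS = {"the", "a", "an", "of", "is", "in", "and", "every", "to"}


def top_words_per_field(theorems, n=10):
    """Return the n most common non-stop words for each field.

    `theorems` is an iterable of (field, trimmed_text) pairs.
    """
    counts = {}
    for field, text in theorems:
        words = re.findall(r"[a-z]+", text.lower())
        kept = (w for w in words if w not in STOP_WORDS)
        counts.setdefault(field, Counter()).update(kept)
    return {field: c.most_common(n) for field, c in counts.items()}


# Toy input, made up for illustration only.
sample = [
    ("Number Theory", "Every prime is a prime number"),
    ("Number Theory", "The prime factors"),
    ("Topology", "Every compact space is a closed space"),
]
print(top_words_per_field(sample, n=3))
```

The resulting `(word, count)` pairs are the kind of input a bar graph or word cloud would be drawn from.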
Field of Mathematics Classification
A classifier was built to determine which field of Mathematics most likely represents a given Theorem. To determine the best model, different models were tested and a Grid Search was used to obtain the best hyperparameters. The ROC curves, accuracy, plots, and grid search visualization are shown below.
Select a Mathematical Theorem from the drop-down box and click submit to see which field of Mathematics the algorithm assigns to that theorem.
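As a rough illustration of the model selection described above, the sketch below runs a grid search over a text classification pipeline with scikit-learn. The SVM matches the classifier named on this page, but the TF-IDF features, the parameter grid, and the toy sentences are assumptions made for the example and are not the notebook's actual configuration or data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy stand-in data (hypothetical): trimmed theorem text and its field label.
texts = [
    "every integer greater than one has a prime divisor",
    "there are infinitely many prime numbers",
    "an odd prime is a sum of two squares when congruent to one modulo four",
    "every integer factors uniquely into prime powers",
    "every continuous map of a compact convex set has a fixed point",
    "a compact subset of a hausdorff space is closed",
    "the continuous image of a connected space is connected",
    "a product of compact spaces is compact",
]
labels = ["Number Theory"] * 4 + ["Topology"] * 4

# Text features followed by a linear SVM (feature choice is an assumption).
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("svm", LinearSVC()),
])

# Hyperparameter grid to search; the values are illustrative only.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "svm__C": [0.1, 1.0, 10.0],
}

# Two-fold cross-validation because the toy data set is tiny.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(texts, labels)

print("best hyperparameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
print("predicted field:", search.predict(["a closed subset of a compact space is compact"])[0])
```

Putting the vectorizer inside the pipeline means it is refit on each training fold, so the cross-validated score is not inflated by vocabulary learned from the held-out fold.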
Beautiful Theorems
Following an article on the most Beautiful theorems of Mathematics from www.quora.com, this section illustrates some traits of these theorems as a collection. Quora asked people to consider the following when selecting theorems for the list:
Category | Description |
---|---|
Generality | It is applicable to a wide variety of problems. |
Succinctness | It is expressible simply, in only a few words or equations. |
Originality | It expresses a surprising mathematical insight, or a connection between different areas of mathematics, that had not previously been widely suspected. |
Significance | It represents an important advance in mathematical knowledge, or resolves an important mathematical problem. |
Potency | It stimulates many new areas of mathematical research. |
Centrality | It is used in the proofs of many subsequent theorems. |
Independence | Its proof depends on only a small number of previously established theorems, and preferably none. |
Based on these criteria, users of Quora found the following Theorems to be the most Beautiful: the Pythagorean Theorem (Geometry, Pythagoras), Euclid's Theorem of the Infinitude of Primes (Number Theory, Euclid), the Minimax Theorem (Game Theory, John von Neumann), the Brouwer Fixed Point Theorem (Topology, Luitzen Brouwer), Cauchy's Residue Theorem (Complex Analysis, Augustin-Louis Cauchy), Fourier's Theorem (Function Theory, Joseph Fourier), the Halting Theorem (Computability Theory, Alan Turing), Gödel's Incompleteness Theorems (Mathematical Logic/Metamathematics, Kurt Gödel), Schubert's Prime Knot Factorization Theorem (Knot Theory, Horst Schubert), Cantor's Theorem (Set Theory/Transfinite Analysis, Georg Cantor), and the Fundamental Theorem of Algebra (Algebra).
Considering these Theorems, the most common words across all of them were plotted in order to try to define which phrasing leads to the most beautiful theorems in Mathematics. The most common words are given below.
Then the SVM classifier was applied to these most common words to determine which field of Mathematics best encompasses the class of Beautiful Theorems. The field that best represents the Beautiful theorems as a whole is Mathematical Logic, a fitting classification, as all of Mathematics draws on this field. Hence, Mathematical Logic would likely be a strong contender in a more subjective classification of the most beautiful theorems. Moreover, Mathematical Logic should be one of the most Unique, Central, Original, and Independent fields, as it is the base upon which all Mathematics is built.
Unclassified Theorems
In the pre-processing step, we removed any theorems that belonged to an obscure field of Mathematics (per Wikipedia). This section tries to re-classify those Theorems into more appropriate fields so that they can be included with their fellow theorems. For example, the field Quantum Theory really should be grouped with Physics, and 'Several Complex Variables' is just Complex Analysis. The table below shows the predicted class of a Theorem or Field. Select either a field or a theorem title to display the page for that Theorem as well as the class predicted by the model.
From the table above we see that the predicted fields sometimes do not agree with what we would guess subjectively. It is not surprising that subjects like Lie Algebra, Queuing Theory, and Mathematical Series are predicted as Abstract Algebra, Stochastic Processes, and Analysis, respectively. However, the classification of fields like Axiom of Choice, Neural Networks, and Quadratic Forms as Model Theory, Partial Differential Equations, and Number Theory did not agree with our first guesses of their parent fields. This demonstrates that the model classifies not the field itself, but the theorem that Wikipedia filed under that field. For example, the Neural Networks theorem was the "Universal approximation theorem", which Wikipedia describes as follows:
In the mathematical theory of artificial neural networks, the universal approximation theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons (i.e., a multilayer perceptron), can approximate continuous functions on compact subsets of \(\mathbb{R}^n\), under mild assumptions on the activation function. The theorem thus states that simple neural networks can represent a wide variety of interesting functions when given appropriate parameters; however, it does not touch upon the algorithmic learnability of those parameters.
One of the first versions of the theorem was proved by George Cybenko in 1989 for sigmoid activation functions.
Kurt Hornik showed in 1991[3] that it is not the specific choice of the activation function, but rather the multilayer feedforward architecture itself which gives neural networks the potential of being universal approximators. The output units are always assumed to be linear. For notational convenience, only the single output case will be shown. The general case can easily be deduced from the single output case.
Now the wording of this quote starts to provide insight into why this was classified as a PDE theorem. This suggests some valuable uses for the prediction algorithm. First, the publisher of these Theorems only thought of the single intended use of each Theorem, and with this prediction model we can find other fields of Mathematics that these Theorems support. Moreover, if someone studies a very specific Mathematical idea, predicting which field it falls under opens up a source of similar ideas within that field. Along the same lines, this allows authors of Mathematics papers to consider who else might be interested. For example, a Neural Networks researcher would realize that PDE researchers might be interested and publish in an appropriate journal to reach them.