Wolfram Blog

Meet the Authors of Hands-on Start to Wolfram Mathematica, Second Edition

Jeremy Sykes — Tue, 24 Jan 2017 17:29:49 +0000

Jeremy Sykes: To celebrate the release of Hands-on Start to Wolfram Mathematica and Programming with the Wolfram Language (HOS2), now in its second edition, I sat down with the authors. Working with Cliff, Kelvin and Michael as the book’s production manager has been an easy and engaging process. I’m thrilled to see the second edition in print, particularly now in its smaller, more conveniently sized format.

Q: Let’s start with Version 11. What’s new for Version 11 in HOS2 that you’d like to talk about?

Michael: As with any major Mathematica release, there are more new things to talk about than can be discussed in the time we have available. But I’m getting a lot of use out of the new graphics capabilities—the new labeling system, the ability to have callouts on a graph, word clouds, enhanced geographical visualizations and even things you don’t think about, such as the removal of discontinuities when plotting things like tan(x). The second edition of the book also includes updates for working with data, like the ability to process audio information, working with linguistic data, new features for date computation and preparing output for 3D printing. (Also new is an index, to make it easier to find specific topics.)

Q: Getting back to basics, I know that HOS existed in some form long before the book came out. Maybe you could fill out some of the history for us.

Cliff: Twenty years ago, I started at Wolfram on the MathMobile, traveling from city to city, visiting organizations (mostly universities—some companies and government labs as well). The MathMobile was a 30-foot trailer (connected to a truck) with three laptop stations where people would come into the trailer to see Mathematica in action in this mobile computer lab. My job was to walk people through how to get started with Mathematica, sometimes answering technical questions for existing users, and sometimes going through a first overview for non-users. Afterward, I worked in technical support and then in sales, and through these experiences, I had the opportunity to see many types of first-time interactions with Mathematica. Thus, my passion for helping people get started with Mathematica was initiated. Several years ago, I came up with the idea for a free video series showing people how to get started. That was extremely popular. From that, many requests for a book version of the video series came. Then we translated the video series into a book.

Q: Tell me a bit about the partnership behind CKM Media and how you came together for the project.

Cliff: In the late 1990s, Kelvin and I began working closely together on many Wolfram projects relating to academia. We found a lot of shared ideas and approaches to problems, bringing very different strengths to those projects. I tended to look at things from a liberal-arts-college perspective and that of a math student who had strong math skills but not a lot of programming experience. Kelvin often came at things more from the mindset of an engineer at a research university. We found that these different mindsets helped ensure that more members of academia were well represented in those projects. Michael started at Wolfram in the mid-2000s. The three of us worked closely together after his hire. Michael brought a computer science mindset with a focus on data analytics and programmatic solutions to real-world problems. So while we have been good friends for about fifteen years, we also bring such different skill sets to projects and feel we make for a great collaborative team for this book.

Q: What makes HOS a good Mathematica teaching tool?

Kelvin: I think it’s a good teaching tool from two perspectives—one, it’s extremely useful for teaching anyone how to get started with Mathematica. We’ve had lots of great feedback from students, teachers, professors and lots of different types of people in the government and commercial sectors. But also, it’s been a great tool for the classroom. Over the years, we’ve learned a lot from the free Hands-on Start video series. The comments and feedback from educators using the videos for their classes helped shape the philosophy of the book. What we wanted was a slow buildup of material that works well for non-users, and specifically for non-users without any coding experience. As the chapters progress, the examples get more intricate and more interesting, using multiple Mathematica functions. At the same time, we wanted the first few chapters to also show a complete sample project in Mathematica. Then, when syntax conventions are covered, they are framed with a discussion about why that convention is useful for a project. The second thing we wanted to do was show the scope of Mathematica and the Wolfram Language.

There are many good books and tutorials for learning Mathematica, but they often focus on one field or class. Our team wanted to provide a good foundation for how to use Mathematica and the Wolfram Language for a broad range of applications. Even Mathematica users who have focused on a select few functions could learn how to use Mathematica in new types of applications or projects. And it’s been fun to see the results so far.

The first edition has been a recommended or required text in classes like chemistry, economics, physics and mathematics, and in classes specific to teaching Mathematica or the Wolfram Language itself.

Jeremy: HOS2 is available from our webstore. It’s also available on Amazon. It’s available in the beautiful, perfect-bound 7×10 paperback copy, and also as a fully updated Kindle version. For those who buy the printed book, we have enrolled the book in Kindle’s MatchBook program, which allows you to buy the EPUB at a reduced cost. We also have plans to release on iTunes. For our international users, we plan to release translated versions of HOS2 in Japanese, Chinese and other languages.

Be sure to check out our upcoming series of Hands-on Start to Wolfram Mathematica Training Tutorials. Learn directly from the authors of the book and ask questions during the interactive Q&A. Visit Wolfram Research’s Facebook event page to learn more about upcoming events.

Exploring a Boxing Legend’s Career with the Wolfram Language: Ali at 75

Jofre Espigule-Pons — Tue, 17 Jan 2017 17:41:04 +0000
Muhammad Ali (born Cassius Marcellus Clay Jr.; January 17, 1942–June 3, 2016) is considered one of the greatest heavyweight boxers in history, with a record of 56 wins and 5 losses. He remains the only three-time lineal heavyweight champion, so there’s no doubt why he is nicknamed “The Greatest.”

I used the Wolfram Language to create several visualizations to celebrate his work and gain some new insights into his life. Last June, I wrote a Wolfram Community post about Ali’s career. On what would have been The Greatest’s 75th birthday, I wanted to take a minute to explore the larger context of Ali’s career, from late-career boxing stats to poetry.

First, I created a PieChart showing Ali’s record:

21, "KO" -> 11, "UD" -> 18, "RTD" -> 5, "SD" -> 1, "LUD" -> 2, "LSD" -> 2, "LRTD" -> 1|>; PieChart[bouts, ChartStyle -> 24, ChartLabels -> Placed[{Map[Style[#, Bold, FontSize -> 14] &, Values[bouts]], Map[Style[#, FontFamily -> "Helvetica Neue", Bold, FontSize -> 16] &, Keys[bouts]]}, {"RadialCenter", "RadialCallout"}], PlotRange -> All, SectorOrigin -> {Automatic, 1}, ChartLegends -> {"Technical Knockout", "Knockout", "Unanimous Decision", "Retired", "Split-Decision", "Lost - Unanimous Decision", "Lost - Split-Decision", "Lost - Retired"}, PlotLabel -> Style["Ali's Record", Bold, FontFamily -> "Helvetica Neue", FontSize -> 22], ImageSize -> 410]" title="bouts = <|"TKO" -> 21, "KO" -> 11, "UD" -> 18, "RTD" -> 5, "SD" -> 1, "LUD" -> 2, "LSD" -> 2, "LRTD" -> 1|>; PieChart[bouts, ChartStyle -> 24, ChartLabels -> Placed[{Map[Style[#, Bold, FontSize -> 14] &, Values[bouts]], Map[Style[#, FontFamily -> "Helvetica Neue", Bold, FontSize -> 16] &, Keys[bouts]]}, {"RadialCenter", "RadialCallout"}], PlotRange -> All, SectorOrigin -> {Automatic, 1}, ChartLegends -> {"Technical Knockout", "Knockout", "Unanimous Decision", "Retired", "Split-Decision", "Lost - Unanimous Decision", "Lost - Split-Decision", "Lost - Retired"}, PlotLabel -> Style["Ali's Record", Bold, FontFamily -> "Helvetica Neue", FontSize -> 22], ImageSize -> 410]" width="620" height="240" />

Ali was dangerous outside the ring as well as inside it, at least for the white establishment in the US. He converted to Islam and changed his name from Cassius Clay, which he called his “slave name,” to Muhammad Ali. Later he refused military service during the Vietnam War, citing his religious beliefs. For this, he was arrested on charges of evading the draft, and he was pulled out of the ring for four years. All this made Ali an icon of racial pride for African Americans and the counterculture generation during the 1960s Civil Rights Movement.

Perhaps a lesser-known fact about Ali is that he played an important role in the emergence of rap, and he was an influential figure in the world of hip-hop music. He earned two Grammy nominations and he wrote several poems, among which is the shortest poem in the English language:

“Me?
Whee!”

So let’s create a WordCloud of his most popular poems. First, I need to import his poems from a database site like Poetry Soup and do some string processing from the HTML file in order to get the poems as plain strings:

" ~~ x__ ~~ ""] -> x], {"\n" -> " ", Shortest["<" ~~ __ ~~ ">"] -> " "}];" title="poemsHTML = Import["http://www.poetrysoup.com/famous/poems/best/muhammad_ali", "Source"]; poems = StringReplace[ StringCases[poemsHTML, Shortest["
" ~~ x__ ~~ "
"] -> x], {"\n" -> " ", Shortest["<" ~~ __ ~~ ">"] -> " "}];" width="606" height="92" />

Here are the first three poems:

Then I get a list of the important words with TextWords and delete the stopwords with DeleteStopwords. Next, I style the word cloud with a boxing glove shape:

RGBColor], BoxForm`ImageTag["Byte", ColorSpace -> "RGB", Interleaving -> True], Selectable->False], DefaultBaseStyle->"ImageGraphics", ImageSize->Automatic, ImageSizeRaw->{225, 225}, PlotRange->{{0, 225}, {0, 225}}]\), ColorFunction -> "SolarColors", ImageSize -> 500]" title="WordCloud[ StringDelete[ DeleteStopwords@Flatten[TextWords@poems], {"\[CloseCurlyQuote]", "ve"}], \!\(\* GraphicsBox[ TagBox[RasterBox[CompressedData[" 1:eJztnb1rXMsZxk2SIo0gqFMlotKlGqcyBFVBnUNKF/cqyiVgfEE3IALq1KpW q8KdQZ1Lgf8AV24NKoLBnUCNGjcbP5u7aL3ZPfPOnJl5P87zwO/6WtjrPefM nJl5P//4488v/vGbJ0+e/PL7b/958cPpn09Ofvj3X//w7Td/e/3LP396ffz3 v7z+1/FPxyd/+vG33374n1/53ZO5ZoQQQgghhBBCCCGEEEIIIYQQQgghhBBC CCGEEEIIIYQQQgghhBBCCCGEEEIIIYQQQgghhBBCCCHrOD4+nr1582b28ePH 2ZcvX2bLuru7m//8+vp6dnJyMtvZ2VH/voRE4OzsbD63SvTp06fZ+fm5+jUQ 4o3Dw8PZ+/fvi+bdJl1dXc22trbUr40Qy2CveXt7W3Xurery8lL9OgmxBube 58+fm869ZeHs+OLFC/XrJkSb58+fz89sWnr79q36PSBEC9gvLQjr7+7urvr9 GOLg4GD28uXLub0XNibsp2EfBniP4Pf4+atXr+br+9OnT9W/M7ELxtLDw4P2 1Ps/4Xtp3xvMn4uLi9nNzc18f/D169dR14Sz9bt37+a2ZevvGdIHjAfLwljt eT+wbmE/0OssjH8Haybn4/R49uzZ3A7iQfBhtLwPmAOaZ+CFPnz4MPcBaY8N 0h6cYTzp/v6+6nlqf39/Pq9X43qsCPEPeDdojxPSBit2F4lw9qoZU4PP6ulv GSvaiOOBvY4XwQZSI4YG9kt8llfhvIB1W3vskPFY3XetqpavHvttC2e8Wjo6 OlIfQ6SM7e3t+XnKg+BbG3u92G96sTXlCjZb7fFE8oAdw6Lfb1WYM2PtgfDh ebjWseJ66AfMv7H+5B6Cf3LMdcJ/OIW5tyyeD+2D/af1cYn3w5g4GMSUeznj 1hauW3uMkWGsn//g/yq9NvjNSvOHI4n5XnaxvjaMiXmBv4x6FGuD2MPy+oC9 Z6nPAfZAD2fb3kK8hfaYI49Yjr1GbErJOxs+ek9xBRrSHnfkf8AnZlWIUSm5 Jtg7qbRYH0sf+NWsqtRuYHlPbU1j7FukDlZ9ECW+ZPgbeO7Ll/YYnDIWz0p4 J6AmTe61WD7PWhdrYemA2CxrQswZ4gNyrmNvb8+8P8W66CvsD2KVrAk5CrnX AZ8DNV6ldi9SjrU81BK7AHIkqDoqef+RcrDvsCTUv8+9Bto964rxo/1ALoQl 5eY84PtHze3TlvbYnAqW1o/cOCnY7qh20h6bU8CS/SI3350xL+2lPT6ngJV8 pNxaX9bOr1GlPT6jY2Uc557/6HfvJ+0xGh0Lyp1/tfuFUpsFO5f2GI2MhbzV XP+DxRi6yEIfGe1xGhXkz2kr1/9uyXY7FeGdl3ouyK8BqHnM/jJytOvS575f I9XU9aSUnQwx9JuEfSyeG+LdYHdAvgt7JT6iqdwzBtc/PSF+f+jZoH5drvD8 MS97956zBGofaQl+kJzaE1z/dJWqj1wjxwZ7InwO8ly050YvNHNzc86AXP/0 leqNU9tHhGceve7+6elp1XtWIuRmpL4n7Z/6kjynVvsU7Jei5i5i3bcg3ONN ddU9+d8j18eQ+G1bC/c30lxEHWlrWq2V4Cn3bxHbGjVXHz3erIwnvLOxh9Oe Q2Ox4JNfp8W9tVxHcVWr59qIsTup8aRR8wR7X8+9vK3WSYM87T9xH9fdXyux tzUkyZ/XfO943J+W+HGo9UI8yKb7jL11hDOiZIxrv9NhM/IUl+NpnbGslM8a wP9prS5PrlL7PbyHrMhL71Ltd1YESWInF3iuqyHxSWjGeayT9b2p5Zr1XrTp DLgO77EFkvFscZ0vqQHWC0/2fquS1puOcK9TsYTW6n8ty2qPDMZcjpO0xg3O Jd4l2W9bt/9arIlKlUtyNgIW8jFrSBKraXEfuipL8zDCu1lT0nw3KzGAY4RY lNR1WrKHpmRlX2rNfuVJUltblHssuV5vPi4L/TKYf1AmaZ5/JJuz5Ho9Sttv Qb9gmaQxiVZqs46VJEfCUzzvqtADVmP+IY6Hyhf2lpL7i31OFElivjzYYoaU U7+hFuzDkC9prRu8V6NIcmaKMJY06jSyF0O+ENsuubeR9vibcqmX8R77s5B0 j1OLKPa6XpLGg0bag0rWwKH6hR7VM//Qmx1ZW5IzUbQcMMkZKVp+sjTuogb0 S8gl3aNEsYNCkj5X2KdGlCQHrQaME5UJObeS+xltby+55mhr4LJ6zEHvtuRe gt8rdS8t5wqUSLIOWKwBVlO5/Z5LiFrvq6YkMZIg0r5+ite8Sa1rYXAOpiXp eRDNDiOp/RApBm9IreNJOQeHJfXHR9rTS/0vU7IlbG9vN5uDkcZOC0nWwGhx DhJfBHIIpyRpjnYJU3qX5UpaHyZSPIw0fyCS/0UiqV28hBY+enxf2BG9z2/J ePScJ7AqnEskYyaa/0WqVr0QW9QiX65dBZsSxrLHHPKprQdDdYkXoP/fVNUq nrtF3YFN7wv4kjAfPYxbiS3MQo+4WpKed6bgixhSqzjS2rVmU/VVPDxHyZrg uUbvsqS236nZYdapVU5FzXqXEt+u9V4LknNRJH9gqpf1gki2p1JJz8y51Myl T9WL81BzS2KL8bCWSyR9rzO/5lHSOnq51LrHqZojHuyIqXzVKPU/pPk5EfLj a6plPkWNPWJqDbH+PpXYvqLY5qXvcw82tJ5qWZO0Rv2TVG9k6/UOJHuzCLYY qa8LeQPU92rprwdj7TOp+nDWz/WoyTD0/T2cZ1OS9h7iHnSzJHbzMYyphZKy sVm2iUpi07z3TJL6ITy8LzXVoyZw6Twc8mFar3kgyRfwHuOeWucXRM6Nr6Fe fQxL+lcN5XhY38el3m24Ns+S1AIAkeJ/WqmVn3Ad8EXn7B+H8l6s53ym6obC 3uRV0jxUPD9Kpl5zMHdv4nkOpvI0vdoIc97Z3vfaPdWzBukC2MlSz2io9oZl O5vE3uw1F0taDwV1DCm5Un64liB2d9NcHDrzW14HJb55yzbdTZLW5Y/UG6OX etUfTa2Lq3GT+NmmP2/ZJpOyiXq0x0jHyJRzAseoR91DKRifeN5YG4d6lVv2 TaRszZbfH+uUMz54BixTL/9Ebazu51Kx5pbPsqvKGRuRetP0lpVe9rlYjb2I MgdzxoWHHBbL6tkXpiZWY7YjzMGc+WfZPuZF0lrk1rCau5SyyVgfs9K6vGBr a8vsmcCTvM5Bq/uf1Bpi2XaYWsNXYW31OvI6B6326fFaByc3ft9rnIFFeZ2D Vsey5HxtrYZMKlfT+vf3Ls9z0GJejCSv3Mo+Gu8L7I1z7jl9EPXleQ5a65GS U5tAWyW9Rzj/2ignH9oilpQTU1KSU1lDsKNI64AuY3HPEUXe56AlP2FubZDe dsXSugmW7nFEpWrpWsdKnH7JuwxnsR52pVyfwwL4/xgD2l45PlmrWIhbK+0f gNqcrWocYm+ca3NZAN+Phfs6BbXukd0DC7Vyx+ZC18qtR/6itN7LJqzZuqIL Oc/ac6gGmqpVlwdrIp5Hbj1qnNeQ81WjfwFtn/3VqidobzTrJ7ToMY46AqhL hs/GGom5gXMd/h+2FeRXSusLSoCtlLXodZQbI2EV2A+0NJTzD4bqU1nAa42p KCo9s1tEq3Z16nvB7mVxv4FaQrS76Et7HNSmdwypxK68GOdjbSW1wL4TthtK X17zd4fobdNLzavVfEHN+j2IIaC/3Za81pJJ0TOvJmUXWbc/hj8wdYasCfac XPdsqkfPFw1go++hsfV8sY9t1fsKnwtbMc97ttXzXdybHjHRkrOgRFij4N8b Yx+D7RX+Cvgv6GPwI9jztedKS1rvSVPxaahNnSvsU+ELxDsE/kGcJ1FPFXMM v2Lvi8/FORT/Ps4TnHM+FdEes0rrmtZDdYiBhRg6yq5KY+m9cXR01Owepnzv rPlADUmz10tvWqxHEntMq3wIKoa050Vvaq9Jqf5KHnu7UP3kPW+3lJp566mc Lw/1tCk9tYjz9wDOb7Vi2VL30ErtNMqmWvmFPVCrNnAqBlsrfpyyL8+1DGtR Y5+Yim+gTZTapCh582MZ67NI7SUYG01tUuT4tFzgnynV7u7u4GfTL0Gtk/da opbmYepzKWqdSuvvRSd3Hkr88xS1TjXqbkUFsZ9SpeYg/fPUOk3VL5+DNM8h tae32huR0tWU4kPHAHtnyo+fqifKOUitCrnU2mPbE9izD+Wfcw5SuaItpoxN /U44B6lcwUagPZ69si7ehXOQylGEni7arMZ+puYg6sJQ1EI1exJMGdR2WSh1 vtasu0/ZUoTegpZAfaWFzTT1ZykK4hrYBomvlaK4BrYlVReSMdvUlPN0LcDc pWkrah8JT+AZUNMVY7P16VFrn7KpqdTutU7vHmyUDUny2kgfGCszTaF3j/bY I4/07glM6Yp1KuxRs6YwZV/IP9Uec+R7cDanpiH6ImxCu8x0pD3WyHp69eSm dJWquU50YcxabLFOk314JoytVK1nos/x8bH2MKEaiXtQP1DxhJh87XFF5DB+ O5YQe5HKXSO24H40lvA8tccUyYdxazHE3oF+wbOjfAv1ZrXHESlnf39fewhR I0U/hH9Y38Kv0JdLe/yQ8aAHMuVPqPGsPXZIPTb1sKBsin7AeOT0GqV0dX9/ rz5eSBu4FvoQ6pFojxXSBmm/X0pPzImPD22kdnV+fq4+Pkh7WHfNpmgDnRaM nbEl1oSZJrC9Ufpij6TpQvuMvliPglxfX2sPw8nq9vZW/fkTG7AecH9x/pFl WAexr5iHRNZxcnKiPTQnIa5/ZAj4p6h2Ygw2kQA7OVVf9P+RHGAvp+qJ8S+k BOZX1BHjP8kY6LMYJ9Qu0H6GxD9cD/OFe7a3t6f+7EgceD6UC312tJ8XiQlr 5qcFH6v2cyKxof9wvbBPQKyR9vMh04D9K77XxcWF+jMh0wPv/KnbarD2oX65 9rMg0+bq6kp7KnQX+uecnp6q33tCFhwcHExmTWTfI2IZnIui9liDTXhnZ0f9 HhMiIVJePuYez3zEI1gz4Kv2Knx3xrmQKMBu8/DwoD2tkkJsLP0MJDLoN2Mt NxHvBuydYVfSvj+E9OTs7Gw+HzVsOLDhIt7n8PBQ/T4QYgHk91xeXs7rPLTY s8KXDp8C1mH2kyZExtHR0Tz3FXPn5uZmvmZi/UJd8Lu7u+9+xc8xf2HDxL4S fw/zjb4EQgghhBBCCCGEEEIIIYQQQgjJ578q61S7 "], {{0, 225}, {225, 0}}, {0, 255}, ColorFunction->RGBColor], BoxForm`ImageTag["Byte", ColorSpace -> "RGB", Interleaving -> True], Selectable->False], DefaultBaseStyle->"ImageGraphics", ImageSize->Automatic, ImageSizeRaw->{225, 225}, PlotRange->{{0, 225}, {0, 225}}]\), ColorFunction -> "SolarColors", ImageSize -> 500]" width="574" height="571" />

With just a glimpse, I can see that he mainly wrote about his opponents, himself and boxing.

In my Community post from last June, I showed how to create the following DateListPlot that shows his victories over time. Note that his suspension period happened just as his performance was rising steeply:

I imported the other data from his Wikipedia page, which allowed me to visualize where these fights took place with GeoGraphics and who his opponents were:

Now as a continuation of that previous post, I would like to further analyze Ali’s opponents. For this, I’m going to take the data from the BoxRec.com site, where one can find a record of all of Ali’s opponents. I’m going to skip the parsing process of the relevant data imported from the HTMLs and will directly use a dataset that I created for this purpose (see the attached file at the end of this post).

First, let’s create a CommunityGraphPlot with all of Ali’s opponents. I want the vertexes of the graph to represent the boxers and the edges to indicate if two boxers encountered each other in the ring. Each community here will represent a group of boxers that are more connected to each other than the rest of boxers, and they will each be represented in a different color. For this, I need the list of opponents of each of Ali’s opponents:

In addition, I can indicate the number of bouts fought by each boxer by plotting the diameter of the vertexes proportionally and also indicate the losses that Ali had during his career with red edges using VertexSize and VertexLabels, respectively (see the complete code in the attached notebook):

#[[2]] &, DeleteDuplicates@ Map[Sort, Flatten[Table[ Map[{boxers[[i, 1]], #} &, Intersection[boxers[[i, 2]], boxersID]], {i, Length[boxers]}], 1]]], VertexLabels -> vertexlabels, VertexSize -> vertexsizes, EdgeStyle -> rededges, ImageSize -> 620]" title="CommunityGraphPlot[ Map[#[[1]] <-> #[[2]] &, DeleteDuplicates@ Map[Sort, Flatten[Table[ Map[{boxers[[i, 1]], #} &, Intersection[boxers[[i, 2]], boxersID]], {i, Length[boxers]}], 1]]], VertexLabels -> vertexlabels, VertexSize -> vertexsizes, EdgeStyle -> rededges, ImageSize -> 620]" width="614" height="177" />
#[[2]] &, DeleteDuplicates@ Map[Sort, Flatten[Table[ Map[{boxers[[i, 1]], #} &, Intersection[boxers[[i, 2]], boxersID]], {i, Length[boxers]}], 1]]], VertexLabels -> vertexlabels, VertexSize -> vertexsizes, EdgeStyle -> rededges, ImageSize -> 620]" title="CommunityGraphPlot[ Map[#[[1]] <-> #[[2]] &, DeleteDuplicates@ Map[Sort, Flatten[Table[ Map[{boxers[[i, 1]], #} &, Intersection[boxers[[i, 2]], boxersID]], {i, Length[boxers]}], 1]]], VertexLabels -> vertexlabels, VertexSize -> vertexsizes, EdgeStyle -> rededges, ImageSize -> 620]" width="620" height="456" />

We can observe that Moore had the largest number of bouts. But was he better than Ali in terms victories over losses?

One way to compare the boxers is by calculating the following ratio for each one:

I can then use a machine learning function such as FindClusters to classify the opponents into different categories, visualized here with a Histogram:

Map[Style[#, FontFamily -> "Helvetica Neue", FontSize -> 14] &, {"Wins-Losses Ratio", "Boxers"}], ChartLegends -> {"Great Boxers", "Good Boxers", "\"Bad\" Boxers"}, ImageSize -> 500]" title="wins = Values@Normal@dataset[All, "wins"]; losses = Values@Normal@dataset[All, "losses"]; draws = Values@Normal@dataset[All, "draws"]; Histogram[FindClusters[(wins - losses)/totalfights], {0.038}, AxesLabel -> Map[Style[#, FontFamily -> "Helvetica Neue", FontSize -> 14] &, {"Wins-Losses Ratio", "Boxers"}], ChartLegends -> {"Great Boxers", "Good Boxers", "\"Bad\" Boxers"}, ImageSize -> 500]" width="585" height="175" />
Map[Style[#, FontFamily -> "Helvetica Neue", FontSize -> 14] &, {"Wins-Losses Ratio", "Boxers"}], ChartLegends -> {"Great Boxers", "Good Boxers", "\"Bad\" Boxers"}, ImageSize -> 500]" title="wins = Values@Normal@dataset[All, "wins"]; losses = Values@Normal@dataset[All, "losses"]; draws = Values@Normal@dataset[All, "draws"]; Histogram[FindClusters[(wins - losses)/totalfights], {0.038}, AxesLabel -> Map[Style[#, FontFamily -> "Helvetica Neue", FontSize -> 14] &, {"Wins-Losses Ratio", "Boxers"}], ChartLegends -> {"Great Boxers", "Good Boxers", "\"Bad\" Boxers"}, ImageSize -> 500]" width="620" height="243" />

Another way to compare the opponents’ records is by plotting a BubbleChart:

"Helvetica Neue", FontSize -> 10, FontColor -> Black, Directive[Opacity[0.7]]], RandomChoice[{Top, Bottom, Left, Right, Center}]] &, {losses, wins, totalfights, namesFamily}]; bubblesClusters = Table[bubbles[[Flatten@Position[clusters, i]]], {i, 3}]; BubbleChart[bubblesClusters, PlotTheme -> "Detailed", AspectRatio -> 1/GoldenRatio, BubbleScale -> "Diameter", ChartBaseStyle -> Directive[Opacity[0.7]], ChartLegends -> Placed[{"Great Boxers", "Good Boxers", "\"Bad\" Boxers"}, Bottom], PlotLabel -> Style["Wins vs. Losses", Bold, FontFamily -> "Helvetica Neue", FontSize -> 18], FrameLabel -> {Style["Losses", Bold, FontFamily -> "Helvetica Neue", FontSize -> 18], Style["Wins", Bold, FontFamily -> "Helvetica Neue", FontSize -> 18]}, ImageSize -> 610]" title="bubbles = MapThread[ Labeled[List[#1, #2, #3], Style[#4, Bold, FontFamily -> "Helvetica Neue", FontSize -> 10, FontColor -> Black, Directive[Opacity[0.7]]], RandomChoice[{Top, Bottom, Left, Right, Center}]] &, {losses, wins, totalfights, namesFamily}]; bubblesClusters = Table[bubbles[[Flatten@Position[clusters, i]]], {i, 3}]; BubbleChart[bubblesClusters, PlotTheme -> "Detailed", AspectRatio -> 1/GoldenRatio, BubbleScale -> "Diameter", ChartBaseStyle -> Directive[Opacity[0.7]], ChartLegends -> Placed[{"Great Boxers", "Good Boxers", "\"Bad\" Boxers"}, Bottom], PlotLabel -> Style["Wins vs. Losses", Bold, FontFamily -> "Helvetica Neue", FontSize -> 18], FrameLabel -> {Style["Losses", Bold, FontFamily -> "Helvetica Neue", FontSize -> 18], Style["Wins", Bold, FontFamily -> "Helvetica Neue", FontSize -> 18]}, ImageSize -> 610]" width="611" height="353" c/>

Under such a classification method, Ali is one of the greatest (as I expected), but Moore is just a “good” boxer, even if he holds the record number of wins. Although this is a nice way to compare boxers, one should be cautious—for example, I noticed that Spinks is classified as a “bad” boxer even though he beat Ali once.

Before concluding the opponents analysis, I will plot Ali’s weight over his career and compare it with the one of his rivals with DateListPlot:

As one should expect, Ali gained weight over the course of his career. And he had one really heavy opponent, Buster Mathis, who weighed over 250 pounds at the end of his career.

Finally, I would like to point out a fun fact that I discovered thanks to the amazing amount of knowledge built into the Wolfram Language. After winning his first world heavyweight title in 1964, there was a little boom of babies named Cassius, who are now around 52 years old. There would probably be even more people called Cassius now if he hadn’t changed his name to Muhammad Ali:

{Style["Years", FontFamily -> "Helvetica Neue", FontSize -> 16], Style["Percentage", FontFamily -> "Helvetica Neue", FontSize -> 16]}, ImageSize -> 500]" title="ListLinePlot[ Partition[ Flatten@Entity["GivenName", {"Cassius", "UnitedStates", "male"}][ EntityProperty["GivenName", "GivenNameDistribution"]], 2], AxesLabel -> {Style["Years", FontFamily -> "Helvetica Neue", FontSize -> 16], Style["Percentage", FontFamily -> "Helvetica Neue", FontSize -> 16]}, ImageSize -> 500]" width="559" height="412" />

The Wolfram Language offers so many possibilities to keep exploring Ali’s life. But I will stop here and encourage you to create your own visualizations and share your ideas on Wolfram Community’s Ali thread.

Download this post as a Computable Document Format (CDF) file along with the accompanying dataset. (Note that you should save the dataset file in the same folder as the notebook in order to load the data needed for the visualizations.) New to CDF? Get your copy for free here.
]>

Automotive Reliability in the Wolfram Language

Nick Lariviere — Fri, 13 Jan 2017 17:39:38 +0000
This post originally appeared on Wolfram Community, where the conversation about reliable cars continues. Be sure to check out that conversation and more—we can’t wait to see what you come up with!

For the past couple of years, I’ve been playing with, collecting and analyzing data from used car auctions in my free time with an automotive journalist named Steve Lang to try and get an idea of what the used car market looks like in terms of long-term vehicle reliability. I figured it was about time that I showed off some of the ways that the Wolfram Language has allowed us to parse through information on over one million vehicles (and counting).

I’ll start off by saying that there isn’t anything terribly elaborate about the process we’re using to collect and analyze the information on these vehicles; it’s mostly a process of reading in reports from our data provider (and cleaning up the data), and then cross-referencing that data with various automotive APIs to get additional information. This data then gets dumped into a database that we use for our analysis, but having all of the tools we need built into the Wolfram Language makes the entire operation something that can be scripted—which greatly streamlines the process. I’ll have to skip over some of the details or this will be a very long post, but I’ll try to cover most of the key elements.

The data we get comes in from a third-party provider that manages used car auctions around the country (unfortunately, our licensing agreement doesn’t allow me to share the data right now), but it’s not very computable at first (the data comes in as a text file report once a week):

Fortunately, parsing this sort of log-like data into individual records is easy in the Wolfram Language using basic string patterns:

Then it’s mostly a matter of cleaning up the individual records into something more standardized (I’ll spare you some of the hacky details due to artifacts in the data feed). You’ll end up with something like the following:

"2017-01-02", "ModelYear" -> 1999, "Make" -> "Acura", "Model" -> "CL", "TransmissionIssue" -> True, "EngineIssue" -> False, "Miles" -> 131612, "VIN" -> "19UYA2256XL014922"|>;" title="record = <|"Date" -> "2017-01-02", "ModelYear" -> 1999, "Make" -> "Acura", "Model" -> "CL", "TransmissionIssue" -> True, "EngineIssue" -> False, "Miles" -> 131612, "VIN" -> "19UYA2256XL014922"|>;" width="483" height="92" />

From there, we use the handy Edmunds vehicle API to get more information on an individual vehicle using their VIN decoder:

vin <> "?fmt=json&api_key=" <> apikey ], "JSON"]" title="lookupVIN[vin_String] := ImportString[ URLFetch["https://api.edmunds.com/api/vehicle/v2/vins/" <> vin <> "?fmt=json&api_key=" <> apikey ], "JSON"]" width="620" height="145" />

vin <> "?fmt=json&api_key=" <> apikey ], "JSON"]" title="lookupVIN[vin_String] := ImportString[ URLFetch["https://api.edmunds.com/api/vehicle/v2/vins/" <> vin <> "?fmt=json&api_key=" <> apikey ], "JSON"]" width="620" height="311" />

We then insert the records into an HSQL database (conveniently included with Mathematica), resulting in an easy way to search for the records we want:

From there, we can take a quick look at metrics using larger datasets, such as the number of transmission issues for a given set of vehicles for different model years:

Or a histogram of those issues broken down by vehicle mileage:

It also lets us look at industry-wide trends, so we can develop a baseline for what the expected rate of defects for an average vehicle (or vehicle of a certain class) should be:

We can then compare a given vehicle to that model:

We then use that model, as well as other information, to generate a statistical index. We use that index to give vehicles an overall quality rating based on their historical reliability, which ranges from a score of 0 (chronic reliability issues) to 100 (exceptional reliability), with the industry average hovering right around 50:

We also use various gauges to put together informative visualizations of defect rates and the overall quality:

{.08, .1}, GaugeLabels -> { Placed[ Style[Row[{"Model average: ", AccountingForm[mileage, DigitBlock -> 3], " miles"}], FontSize -> 20], Above], Placed[ Style[Row[{"Industry average: ", AccountingForm[$IndustryAverageMileage, DigitBlock -> 3], " miles"}], FontSize -> 16], Below] }, ScaleRanges -> {If[ mileage < $IndustryAverageMileage, {mileage, \ \ $IndustryAverageMileage}, {$IndustryAverageMileage, mileage}]}, ScaleRangeStyle -> color, GaugeStyle -> {Darker[Red], Black}, ImageSize -> 500, ScaleDivisions -> {7, 7}, GaugeFaceStyle -> Lighter[color, .8], opts] ]" title="MileageGauge[mileage_, opts___] := With[{color = Which[ mileage <= 100000, Lighter[Red], 100000 <= mileage <= 120000, Lighter[Yellow], 120000 <= mileage <= 130000, Lighter[Blue], True, Lighter[Green]]}, HorizontalGauge[{mileage, $IndustryAverageMileage}, {50000, 200000}, ScalePadding -> {.08, .1}, GaugeLabels -> { Placed[ Style[Row[{"Model average: ", AccountingForm[mileage, DigitBlock -> 3], " miles"}], FontSize -> 20], Above], Placed[ Style[Row[{"Industry average: ", AccountingForm[$IndustryAverageMileage, DigitBlock -> 3], " miles"}], FontSize -> 16], Below] }, ScaleRanges -> {If[ mileage < $IndustryAverageMileage, {mileage, \ \ $IndustryAverageMileage}, {$IndustryAverageMileage, mileage}]}, ScaleRangeStyle -> color, GaugeStyle -> {Darker[Red], Black}, ImageSize -> 500, ScaleDivisions -> {7, 7}, GaugeFaceStyle -> Lighter[color, .8], opts] ]" width="544" height="761" />

Style[ToString[N[value, 3]*100] <> "%", 15], PlotLabel -> Style["Transmission Issues", 15], ScaleRanges -> {{0, $IndustryAverageIssueRates - .01} -> Lighter[Green], {{$IndustryAverageIssueRates - .01, $IndustryAverageIssueRates + .01}, {0, .2}}, \ \ {$IndustryAverageIssueRates + .01, 1.5*$IndustryAverageIssueRates} -> Lighter[Yellow], {1.5*$IndustryAverageIssueRates, 1} -> Lighter[Red]}, GaugeStyle -> {RGBColor[{.15, .4, .6}], RGBColor[{.5, .5, .5}]}]" title="announcementGauge[value_] := AngularGauge[value, {0, .3}, GaugeLabels -> Style[ToString[N[value, 3]*100] <> "%", 15], PlotLabel -> Style["Transmission Issues", 15], ScaleRanges -> {{0, $IndustryAverageIssueRates - .01} -> Lighter[Green], {{$IndustryAverageIssueRates - .01, $IndustryAverageIssueRates + .01}, {0, .2}}, \ \ {$IndustryAverageIssueRates + .01, 1.5*$IndustryAverageIssueRates} -> Lighter[Yellow], {1.5*$IndustryAverageIssueRates, 1} -> Lighter[Red]}, GaugeStyle -> {RGBColor[{.15, .4, .6}], RGBColor[{.5, .5, .5}]}]" width="441" height="506" />

There is a lot more we do to pull all of this together (like the Wolfram Language templating we use to generate the HTML pages and reports), and honestly, there is a whole lot more we could do (my background in statistics is pretty limited, so most of this is pretty rudimentary, and I’m sure others here may already have ideas for improvements in presentation for some of this data). If you’d like to take a look at the site, it’s freely available (Steve has a nice introduction to the site here, and he also writes articles for the page related to practical uses for our findings).

Our original site was called the Long-Term Quality Index, which is still live but showed off my lack of experience in HTML development, so we recently rolled out our newer, WordPress-based venture Dashboard Light, which also includes insights from our auto journalist on his experiences running an independent, used car dealership.

This is essentially a two-man project that Steve and I handle in our (limited) free time, and we’re still getting a handle on presenting the data in a useful way, so if anyone has any suggestions or questions about our methodology, feel free to reach out to us.

Cheers!

Continue the conversation at Wolfram Community.
]>

Recent Wolfram Technology Books

John Moore — Mon, 09 Jan 2017 17:48:35 +0000
We’re always excited to see new books that explore new ways to use Wolfram technologies. Authors continue to find inventive ways to think with the Wolfram Language. A variety of new Wolfram technology books have been published over the past few months. We hope that you’ll find something on this list to support your new year’s resolution to upgrade your skills. (Update: also look for the newly released Chinese translation of Stephen Wolfram’s An Elementary Introduction to the Wolfram Language.)

Toolbox for the Mathematica Programmers

This new guide from Viktor Aladjev and V. A. Vaganov outlines a modular approach to programming with the Wolfram Language. Providing over 800 tools that can be incorporated into a variety of projects, Toolbox for the Mathematica Programmers will be useful for students and seasoned programmers alike.

Option Valuation under Stochastic Volatility II: With Mathematica Code

In this second volume of his series about quantitative finance, Alan L. Lewis’s Option Valuation under Stochastic Volatility II: With Mathematica Code expands his original focus to include jump diffusions. The finance industry is increasingly relying on computational analysis to model risk and track customer data. Lewis’s volume is a welcome addition to the literature of the field, of interest for both researchers and investors/traders looking to learn more about computational thinking. Topics covered include spectral theory for jump diffusions, boundary behavior for short-term interest rate models, modeling VIX options, inference theory and discrete dividends.

CRC Standard Curves and Surfaces with Mathematica

The third edition of the popular CRC Standard Curves and Surfaces with Mathematica is an indispensable reference text for anyone who works with curves and surfaces, from engineers to graphic designers. With new illustrations in almost every chapter, the updated version contains nearly 1,000 visualizations, depicting nearly every geometrical figure used today. It also includes a CD with a series of interactive Computable Document Format (CDF) files.

Butterworth & Bessel Filters

T. D. McGlone provides a useful introduction to Butterworth and Bessel (aka Thomson) filter functions. With an overview of mathematical functions, topology choices and component selection based on sensitivity criteria, Butterworth & Bessel Filters will be particularly useful for engineers.

Automation of Finite Element Methods

Another text for engineers, Automation of Finite Element Methods provides an introduction to developing virtual prediction techniques. New finite elements need to be created for individual purposes, which can be time-consuming. Authors Jože Korelc and Peter Wriggers outline an approach to automating this process through Wolfram Language programming.

Computational Proximity: Excursions in the Topology of Digital Images

Based on James F. Peters’s popular graduate course on the topology of digital images, Computational Proximity: Excursions in the Topology of Digital Images introduces the concept of computational proximity as an algorithmic approach to finding nonempty sets of points that are either close to each other or far apart. Peters discusses the applications of this concept in computer vision, multimedia, brain activity, biology, social networks and cosmology.

Now available as well is the Chinese translation of Stephen Wolfram’s An Elementary Introduction to the Wolfram Language: Wolfram 语言入门. The translated edition includes all of the material that made the English edition popular with anyone wanting to learn to program in the Wolfram Language. Look out for translations into additional languages in the future!

]>

Our Readers’ Favorite Stories from 2016

John Moore — Tue, 03 Jan 2017 18:14:42 +0000

It’s been a busy year here at the Wolfram Blog. We’ve written about ways to avoid the UK’s most unhygienic foods, exciting new developments in mathematics and even how you can become a better Pokémon GO player. Here are some of our most popular stories from the year.

Today We Launch Version 11!

In August, we launched Version 11 of Mathematica and the Wolfram Language. The result of two years of development, Version 11 includes exciting new functionality like the expanded map generation enabled by satellite images. Here’s what Wolfram CEO Stephen Wolfram had to say about the new release in his blog post:

OK, so what’s the big new thing in Version 11? Well, it’s not one big thing; it’s many big things. To give a sense of scale, there are 555 completely new functions that we’re adding in Version 11—representing a huge amount of new functionality (by comparison, Version 1 had a total of 551 functions altogether). And actually that function count is even an underrepresentation—because it doesn’t include the vast deepening of many existing functions.

Finding the Most Unhygienic Food in the UK

Using the Wolfram Language, John McLoone analyzes government data about food safety inspections to create visualizations of the most unhygienic food in the UK. The post is a treasure trove of maps and charts of food establishments that should be avoided at all costs, and includes McLoone’s greatest tip for food safety: “If you really care about food hygiene, then the best advice is probably just to never be rude to the waiter until after you have gotten your food!”

Finding Pokémon GO’s Shortest Tour to Compute ’em All!

Bernat Espigulé-Pons creates visualizations of Pokémon across multiple generations of the game and then uses WikipediaData, GeoDistance and FindShortestTour to create a map to local Pokémon GO gyms. If you’re a 90s kid or an avid gamer, Espigulé-Pons’s Pokémon genealogy is perfect gamer geek joy. If you’re not, this post might just help to explain what all those crowds were doing in your neighborhood park earlier this year.

Behind Wolfram|Alpha’s Mathematical Induction-Based Proof Generator

Connor Flood writes about creating “the world’s first online syntax-free proof generator using induction,” which he designed using Wolfram|Alpha. With a detailed explanation of the origin of the concept and its creation from development to prototyping, this post provides a glimpse into the ways that computational thinking applications are created.

An Exact Value for the Planck Constant: Why Reaching It Took 100 Years

Wolfram|Alpha Chief Scientist Michael Trott returns with a post about the history of the discovery of the exact value of the Planck constant, covering everything from the base elements of superheroes to the redefinition of the kilogram.

Launching the Wolfram Open Cloud: Open Access to the Wolfram Language

In January of 2016, we launched the Wolfram Open Cloud to—as Stephen Wolfram says in his blog post about the launch—“let anyone in the world use the Wolfram Language—and do sophisticated knowledge-based programming—free on the web.” You can read more about this integrated cloud-based computing platform in his January post.

On the Detection of Gravitational Waves by LIGO

In February, the Laser Interferometer Gravitational-Wave Observatory (LIGO) announced that it had confirmed the first detection of a gravitational wave. Wolfram software engineer Jason Grigsby explains what gravitational waves are and why the detection of them by LIGO is such an exciting landmark in experimental physics.

Computational Stippling: Can Machines Do as Well as Humans?

Silvia Hao uses Mathematica to recreate the renaissance engraving technique of stippling: a kind of drawing style using only points to mimic lines, edges and grayscale. Her post is filled with intriguing illustrations and is a wonderful example of the intersection of math and illustration/drawing.

Newest Wolfram Technologies Books Cover Range of STEM Topics

In April, we reported on new books that use Wolfram technology to explore a variety of STEM topics, from data analysis to engineering. With resources for teachers, researchers and industry professionals and books written in English, Japanese and Spanish, there’s a lot of Wolfram reading to catch up on!

Announcing Wolfram Programming Lab

The year 2016 also saw the launch of Wolfram Programming Lab, an interactive online platform for learning to program in the Wolfram Language. Programming Lab includes a digital version of Stephen Wolfram’s 2016 book, An Elementary Introduction to the Wolfram Language, as well as Explorations for programmers already familiar with other languages and numerous examples for those who learn best by experimentation.
]>

Gardening à la Gardner

Kathryn Cramer — Wed, 28 Dec 2016 17:45:38 +0000
When looking through the posts on Wolfram Community, the last thing I expected was to find exciting gardening ideas.

The general idea of Ed Pegg’s tribute post honoring Martin Gardner, “Extreme Orchards for Gardner,” is to find patterns for planting trees in configurations with constraints like “25 trees to get 18 lines, each having 5 trees.” Most of the configurations look like ridiculous ideas of how to plant actual trees. For example:

I have a seven-acre apple orchard with 200+ trees in New York’s Adirondack Park, and so I read “Extreme Orchards for Gardner” as a gardener first. Of course, Pegg’s post was never intended as a proposal for how to plant actual orchards, but as I live in the middle of an orchard, I can’t help wondering, what if you did plant orchards this way?

When considering this as an actual planting pattern, we should borrow that character ubiquitous in physics: the observer. To the observer on the ground, only the center cluster would look much like an orchard; the trees at the vertices would appear to have nothing much to do with the rest.

One of my favorite physics jokes is the one about the theoretical physicist who loses his job as a professor and has to go to work as a milkman. (Once upon a time, milk was delivered to people’s houses by “milkmen.”) After a few weeks on the job, the physicist just can’t stand not being able to give lectures. So he assembles his colleagues in front of a blackboard, draws a circle on the board and begins by saying, “Consider a spherical cow of uniform density.” The representation of orchards by Martin Gardner, Branko Grünbaum and such in the usual rendition of the orchard planting problem is to real orchards as spherical cows are to the animals who produce the milk you drink. So, to some extent, the fact that trees are not points and need a certain spacing is an unfair criticism. Nonetheless, since every way I look out my windows I see real apple trees, I feel compelled to point this out. (I think Grünbaum, who was my professor many years ago and who encouraged us to reality-test our mathematical ideas, would approve.)

This is even more true for this configuration involving rows of six “trees.” Just how much land would it take to plant an orchard like this using real trees? No one would do this.

Pegg also shows some more possible configurations—like these, in which the lines pass through exactly four trees each. For actual, rather than hypothetical, trees, some of these look a bit more workable.

My own apple trees, planted in the mid-1980s, are planted in rows, which is practical if a bit boring.

There are pragmatic constraints involved in planting apple trees. The orchardist needs access to the trees from two sides, both for maintenance (pruning, spraying, etc.) and to harvest apples. Assuming semi-dwarf trees, this involves aisles with a minimum width of about 22 feet (ca. 6.7 meters), starting from the center of each trunk. The trees should be planted no closer than intervals of 16 feet (ca. 4.9 meters) to give them enough air and light.

Only configurations in which there is a small variation in the segments connecting trees could realistically be planted as something that would, on the ground, resemble an orchard. Most of the configurations would require an enormous amount of land and so are mostly mathematical abstractions rather than something one could really implement.

But the configuration on the lower left in Pegg’s four-tree grouping looks like something one could actually plant. Like so:

One advantage I see in the configurations with a small variation in segment length is that planting a portion of the orchard as pentagons within pentagons reduces the amount of grass under the trees to be maintained, thus significantly reducing mowing and therefore labor and gasoline costs. So it is not completely foolish to consider planting at least a small orchard this way.

I am attracted to the 25-tree pentagon configuration because of its empty center circle, creating a private grove space. Taking into account an air gap around the outside, my guess is that a circle in the field of about 125 feet in diameter should be big enough. That center circle could, for example, hold a very nice circle of wildflowers 20 feet across for bee forage, maybe some beehives in the center, and still leave room for equipment to navigate.

Another advantage: this would be a good layout for planting five types of trees in groups of five. They could then be easily identified in their mini-groves and harvested together. The more I thought about it, the more this became something I might actually want to do. I started shopping online for heritage varieties of apple trees, looking around at my farm for the right place to put the new trees, imagining new designs…. Hmm.

Pegg, on the other hand, is more concerned with finding new solutions to the abstract version of the orchard problem, which are indeed quite beautiful, if impractical for the planting of trees:

These contemplations make me want to go deeper into mathematical patterns to see what else might be plantable. Maybe this last “orchard” plot might work with bulbs.
]>

The Semantic Representation of Pure Mathematics

Eric Weisstein — Thu, 22 Dec 2016 17:48:36 +0000

Introduction

Building on thirty years of research, development and use throughout the world, Mathematica and the Wolfram Language continue to be both designed for the long term and extremely successful in doing computational mathematics. The nearly 6,000 symbols built into the Wolfram Language as of 2016 allow a huge variety of computational objects to be represented and manipulated—from special functions to graphics to geometric regions. In addition, the Wolfram Knowledgebase and its associated entity framework allow hundreds of concrete “things” (e.g. people, cities, foods and planets) to be expressed, manipulated and computed with.

Despite a rapidly and ever-increasing number of domains known to the Wolfram Language, many knowledge domains still await computational representation. In his blog “Computational Knowledge and the Future of Pure Mathematics,” Stephen Wolfram presented a grand vision for the representation of abstract mathematics, known variously as the Computable Archive of Mathematics or Mathematics Heritage Project (MHP). The eventual goal of this project is no less than to render all of the approximately 100 million pages of peer-reviewed research mathematics published over the last several centuries into a computer-readable form.

In today’s blog, we give a glimpse into the future of that vision based on two projects involving the semantic representation of abstract mathematics. By way of further background and motivation for this work, we first briefly discuss an international workshop dedicated to the semantic representation of mathematical knowledge, which took place earlier this year. Next, we present our work on representing the abstract mathematical concepts of function spaces and topological spaces. Finally, we showcase some experimental work on representing the concepts and theorems of general topology in the Wolfram Language.

The Semantic Representation of Mathematical Knowledge Workshop

In February 2016, the Wolfram Foundation, together with the Fields Institute and the IMU/CEIC working group for the creation of a Global Digital Mathematics Library, organized a Semantic Representation of Mathematical Knowledge Workshop designed to pool the knowledge and experience of a small and select group of experts in order to produce agreement on a forward path toward the semantic encoding of all mathematics. This workshop was sponsored by the Alfred P. Sloan Foundation and held at the Fields Institute in Toronto. The workshop included approximately forty participants who met for three days of talks and discussions. Participants included specialists from various fields, including:

computer algebra

interactive and automatic proof systems

mathematical knowledge representation

foundations of mathematics

practicing pure mathematics

Among the many accomplished and knowledgeable participants (a complete list of whom, together with the complete schedule of events, may be viewed on the workshop website), Georges Gonthier and Tom Hales shared their experience on the world’s largest extant formal proofs (the Feit–Thompson odd order theorem and the Kepler conjecture, respectively); Harvey Friedman, Dana Scott and Yuri Matiyasevich brought expertise on mathematical foundations, incompleteness and undecidability; Jeremy Avigad and John Harrison shared their knowledge and experience in designing and implementing two of the world’s most powerful theorem provers; Bruno Buchberger and Wieb Bosma contributed extensive knowledge on computational mathematics; Fields Medal winners Stanislav Smirnov and Manjul Bhargava expounded on the needs of practicing mathematicians; and Ingrid Daubechies and Stephen Wolfram shared their thoughts and knowledge on many technical and organizational challenges of the problem as a whole.

As one might imagine, the list of topics discussed at the workshop was quite extensive. In particular, it included type theory, the calculus of constructions, homotopy type theory, mathematical vernacular, partial functions and proof representations, together with many more. The following word cloud, compiled from the text of hundreds of publications by the workshop participants, gives a glimpse of the main topics:

Recordings of workshop presentations can be viewed on the workshop video archive, and a white paper discussing the workshop’s outcomes is also available. In addition, because of the often under-emphasized yet vital importance of the subject for the future development (and practice) of mathematics in the coming decades, 18 participants were interviewed on the technological and scientific needs for achieving such a project, culminating in a 90-minute video (excerpts also available in a 9-minute condensed version) that highlights the visions and thoughts of some of the world’s most important practitioners. We thank filmmaker Amy Young for volunteering her time and talents in the compilation and production of this unique glimpse into the thoughts of renowned mathematicians and computer scientists from around the world, which we sincerely hope other viewers will find as inspiring and enlightening as we do.

Computational Encoding of Function Spaces

The eCF project encoded continued fraction terminology, theorems, literature and identities in computational form, demonstrating that Wolfram|Alpha and the Wolfram Language provide a powerful framework for representing, exposing and manipulating mathematical knowledge.

While the theory of continued fractions contains both high-level and abstract mathematics, it represents only a tiny first step toward Stephen Wolfram’s grand vision for computational access to all of mathematics and the dynamic use of mathematical knowledge. Our next step down this challenging path therefore sought to encode within the Wolfram Language and Wolfram|Alpha entity-property framework a domain of more abstract and inhomogeneous mathematical objects having nontrivial properties and relations. The domain chosen for this next step was the important and fairly abstract branch of mathematics known as functional analysis.

That step posed a number of new challenges, among them the need for graduate-level mathematical knowledge in the domain of interest, formulation of entity names that “naturally” contain parameters and encode additional information (say, measure spaces) and the introduction of stub extensions to the Wolfram Language.

Work was carried out from December 2014–July 2016 and consisted of knowledge curation in three interconnected knowledge domains: "FunctionSpace", "TopologicalSpaceType" and "FunctionalAnalysisSource", together with the development of framework extensions to support them. This functionality was recently made available through the Wolfram Language entity framework and consists of the following content:

126 function spaces (many parametrized); 45 properties

39 topological space types; 14 properties

147 functional analysis sources; 49 properties

Full availability on the Wolfram|Alpha website is expected by early January 2017.

Function Spaces

Two underlying concepts in functional analysis are those of the function space and the topological space. A function space is a set of functions of a given kind from one set to another. Common examples of function spaces include L^p spaces (Lebesgue spaces; defined using a natural generalization of the p-norm for finite-dimensional vector spaces) and C^k spaces (consisting of functions whose derivatives exist and are continuous up to k^th order).

As a simple first example in accessing this functionality, we can use RandomEntity to return a sample list of function spaces:

Similarly, EntityValue can be used to access curated properties for a given space:

All, Background -> {Automatic, {{LightBlue, None}}}, BaseStyle -> 8, ItemSize -> {{13, 76}, Automatic}]" title="TextGrid[With[{props = {alternate names, associated people, Bessel inequality, Cauchy-Schwarz inequality, classes, classifications, dual space, inner product, isomorphic spaces, measure space, norm, related results, isomorphic spaces, measure space, norm, related results, relationship graph, timeline, triangle inequality, typeset description}}, Transpose[{props, EntityValue[Lebesgue space L2(Rn, dxn) function space, props]}]],Dividers -> All, Background -> {Automatic, {{LightBlue, None}}}, BaseStyle -> 8, ItemSize -> {{13, 76}, Automatic}]" width="620" height="296" class="alignnone size-full wp-image-34262" />

As can be seen in various properties in this table, some mathematical representations required the introduction of new symbols not (yet) present in the Wolfram Language. This was accomplished by introducing them into a special PureMath` context. For example, after evaluating the above table, the following “pure math extension symbols” appear:

For now, these constructs are just representational. However, they are not merely placeholders for mathematical concepts/computational structures, but also have the benefit of enhancing human readability by automatically adding traditional mathematical typesetting and annotations. This can be seen, for example, by comparing the raw semantic expressions in the table above with those displayed on the Wolfram|Alpha website:

In the longer term, many such concepts may be instantiated in the Wolfram Language itself. As a result, both this and any similar semantic projects to follow will help guide the inclusion and implementation of computational mathematical functionality within Mathematica and the Wolfram Language.

A slightly more involved example demonstrates how the entity framework can be used to construct programmatic queries. Here, we obtain a list of all curated function spaces associated with mathematician David Hilbert:

ContainsAny[{Entity["Person", "DavidHilbert::8r974"]}]]]" title="EntityList[ Entity["FunctionSpace", EntityProperty["FunctionSpace", "AssociatedPeople"] -> ContainsAny[{Entity["Person", "DavidHilbert::8r974"]}]]]" width="620" height="179" class="alignnone size-full wp-image-34271" />

One interesting property from the table above that warrants a bit more scrutiny is "RelationshipGraph". This consists of a hierarchical directed graph connecting all curated topological space types, where nodes A and B are connected by a directed edge A↦B if and only if “S is a topological space of type A” implies “S is a topological space of type B”, and with the additional constraint that all nodes are connected only via paths maximizing the number of intermediate nodes. For each function space, this graph also indicates (in red) topological space types to which a given space belongs. For example, the Lebesgue space L² has the following relationship graph:

Here we show a similar graph in a slightly more streamlined and schematic form:

This graph corresponds to the following topological space type memberships:

While portions of this graph appear in the literature, the above graph represents, to our knowledge, the most complete synthesis of the hierarchical structure of topological vector spaces available. (The preceding notwithstanding, it is important to keep in mind that the detailed structure depends on the detailed conventions adopted in the definitions of various topological spaces—conventions that are not uniform across the literature.) A number of interesting facts can be gleaned from the graph. In particular, it can immediately be seen that the well-known Hilbert and Banach spaces (which have high-level structural properties whose relaxations lead to more general spaces) fall at the top of the hierarchical heap together with “inner product space.” On the other hand, topological vector spaces are the “most generic” types in some heuristic sense.

During the curation process, we have taken great care that function space properties are correct for all parameter values. This can be illustrated using code like the following to generate a tab view of Lebesgue spaces for various values of its parameter p and noting how properties adjust accordingly:

Grid[Transpose[{props, EntityValue[space, props]}], Dividers -> All, Alignment -> {Left, Center}], {space, {lebesgue, lebesgue /. \[FormalP] -> 1/3, lebesgue /. \[FormalP] -> 3, lebesgue /. \[FormalP] -> "Infinity"}}]]]" title="With[{props = {"Norm", "TypesetDescription", "Classifications"}, lebesgue = Entity["FunctionSpace", {{"LebesgueL", {{"Reals", \[FormalN]}, \ {"LebesgueMeasure", \[FormalN]}}}, \[FormalP]}]}, TabView[Table[ space -> Grid[Transpose[{props, EntityValue[space, props]}], Dividers -> All, Alignment -> {Left, Center}], {space, {lebesgue, lebesgue /. \[FormalP] -> 1/3, lebesgue /. \[FormalP] -> 3, lebesgue /. \[FormalP] -> "Infinity"}}]]]" title="With[{props = {"Norm", "TypesetDescription", "Classifications"}, lebesgue = Entity["FunctionSpace", {{"LebesgueL", {{"Reals", \[FormalN]}, \ {"LebesgueMeasure", \[FormalN]}}}, \[FormalP]}]}, TabView[Table[ space -> Grid[Transpose[{props, EntityValue[space, props]}], Dividers -> All, Alignment -> {Left, Center}], {space, {lebesgue, lebesgue /. \[FormalP] -> 1/3, lebesgue /. \[FormalP] -> 3, lebesgue /. \[FormalP] -> "Infinity"}}]]]" title="With[{props = {"Norm", "TypesetDescription", "Classifications"}, lebesgue = Entity["FunctionSpace", {{"LebesgueL", {{"Reals", \[FormalN]}, \ {"LebesgueMeasure", \[FormalN]}}}, \[FormalP]}]}, TabView[Table[ space -> Grid[Transpose[{props, EntityValue[space, props]}], Dividers -> All, Alignment -> {Left, Center}], {space, {lebesgue, lebesgue /. \[FormalP] -> 1/3, lebesgue /. \[FormalP] -> 3, lebesgue /. \[FormalP] -> "Infinity"}}]]]" src="http://blog.wolfram.com/data/uploads/2016/12/7_PureMath.png" />

One of the beautiful things about computational encoding (and part of the reason it is so desirable for mathematics as a whole) is that known results can be easily tested or verified. (Similarly, and maybe even more importantly, new propositions can be easily formulated and explored.) As an example, consider the duality of Lebesgue spaces L^p and L^q for 1/p+1/q=1 with p≥1. First, define a variable to represent the L^p entity:

Now, use the "DualSpace" property (which may be specified either as a string or via a fully qualified EntityProperty["FunctionSpace", "DualSpace"] object, the latter of which may be given directly in that form or the corresponding formatted form ) to obtain the dual entity:

As can be seen, this formulation allows computation to be performed and expressed through the elegant paradigm of symbolic transformation of the entity canonical name. Taking the dual space of L^q in turn then gives:

Finally, applying symbolic simplification to the entity canonical name:

This verifies we have obtained the same space we originally started with:

In other words, that the double dual (L^p)^**, where * denotes the dual space, is equivalent to L^p. (Function spaces with this property are said to be reflexive.)

It is also important to emphasize that the curation of the existing literature on function spaces is not always straightforward, as illustrated in particular by the myriad of (mutually conflicting) conventions used for the interrelated collection of function spaces known as Campanato–Morrey spaces:

e]" title="cm = Cases[{#, Or @@ (StringMatchQ[ Cases[CanonicalName[#], _String, \[Infinity]], "*Campanato*" | "*Morrey*"])} & /@ EntityList["FunctionSpace"], {e_, True} :> e]" width="620" height="312" class="alignnone size-full wp-image-34285" />

This challenge is made clear with the following table, whose creation required a meticulous study of the literature:

As a result of multiple conventions, we chose in cases like this to include multiple, separate entities that are equivalent under appropriate (but possibly nontrivial) transformations of parameters and notations. For example:

All, ItemSize -> {{13, 68}, Automatic}, BaseStyle -> 10]" title="Grid[Transpose[ Function[r, {cm[[r]], EntityValue[cm[[r]], "TypesetDescription"]}]@{4, 5}], Dividers -> All, ItemSize -> {{13, 68}, Automatic}, BaseStyle -> 10]" width="620" height="392" class="alignnone size-full wp-image-34287" />

Topological Space Types

A topological space may be defined as a set of points and neighborhoods for each point satisfying a set of axioms relating the points and neighborhoods. The definition of a topological space relies only upon set theory and is the most general notion of a mathematical space that allows for the definition of concepts such as continuity, connectedness and convergence. Other spaces, such as manifolds and metric spaces, are specializations of topological spaces with extra structures or constraints. Common examples of topological vector spaces include the Banach space (a complete normed vector space) and the Hilbert space (an abstract Banach space possessing the structure of an inner product that allows length and angle to be measured). Topological spaces could be considered more abstract than function spaces (e.g. they are typically defined based on the existence of a norm as opposed to having a definite value for their norm). Being so general, topological spaces are a central unifying notion and appear in virtually every branch of modern mathematics. The branch of mathematics that studies topological spaces in their own right is called point-set topology or general topology.

EntityList can be used to see a complete list of curated topological space types:

Similarly, EntityValue[space type, "PropertyAssociation"] returns all curated properties for a given space:

All, Background -> {Automatic, {{LightBlue, None}}}, BaseStyle -> 8, ItemSize -> {{12, 77}, Automatic}]" title="TextGrid[List @@@ Normal[DeleteMissing[ EntityValue[Entity["TopologicalSpaceType", "HilbertSpace"], "PropertyAssociation"]]], Dividers -> All, Background -> {Automatic, {{LightBlue, None}}}, BaseStyle -> 8, ItemSize -> {{12, 77}, Automatic}]" width="620" height="137" class="alignnone size-full wp-image-34290" />

While more could be said and done with topological space types, in this project this domain was primarily used as a convenient way to classify function spaces. However, as the second project to be discussed in this blog will show, additional exploratory work is currently being done that could result in the augmentation of the human- (but not computer-) readable descriptions of topological spaces with semantically encoded versions potentially even suitable for use with automated proof assistants or theorem provers.

Functional Analysis Sources

A final component added in this project was a set of cross-linked literature references that provide provenance and documentation for the various conventions (definitions etc.) adopted in our curated functional analysis datasets. These references can be searched based on the journal in which a paper appears, the year or decade it was published, the author or the language in which it was written:

functional analysis papers in Crelle’s journal

functional analysis papers from the 1940s

functional analysis works by Grothendieck

Besov’s 1959 functional analysis paper

functional analysis papers in Italian

For mathematicians who wish to explore the source of the data down to the page (theorem etc.) level, this information has also been encoded:

Finally, we can use this detailed reference information in a way that provides a convenient overview of both existing notational conventions and those we adopted in this project:

Encoding of Concepts and Theorems from Topology

The second project we discuss in this blog is the not-unrelated augmentation of the Wolfram Language to precisely represent the definitions of mathematical concepts, statements and proofs in the field of point-set topology. This was done by creating an “entity store” for general topology consisting of concepts and theorems curated from the second edition of James Munkres’s popular Topology textbook. Although this project did not construct an explicit proof language (suitable, say, for use by a proof assistant or automated theorem prover), it did result in the comprehensive representation of 216 concepts and 225 theorems from a standard mathematical text, which is a prelude to any work involving machine proof.

EntityStore is a function introduced in Version 11 of the Wolfram Language that allows custom entity-property data to be packaged, placed in the cloud via the Wolfram Data Repository and then conveniently loaded and used. To load and use the general topology entity store, first access it via its ResourceData handle, then make it available in the Wolfram Language by prepending it to the list of known entity stores contained in the global $EntityStores variable:

As can be seen in the output, a nice summary blob shows the contents of the registered stores (in this case, a list containing the single store we just registered), including the counts of entities and properties in each of its constituent domains. Now that the entity store is registered, the custom entities it contains can be used within the Wolfram Language entity framework just as if they were built in. For example:

Similarly, we can see a full list of currently supported properties for topological theorems using EntityValue:

Before proceeding, we perform a little context path manipulation to make output symbols format more concisely (slightly deferring a discussion of why we do this until the end of this section):

A nice summary table can now be generated to show basic information about a given theorem:

"InputFormSummaryGrid" displays the same information as "SummaryGrid", but without applying the formatting rules we’ve used to make the concepts and theorems easily readable. It’s a good way to see the exact internal representation of the data associated with the entity. This can help us to understand what is going on when the formatting rules obscure this structure:

While it’s pretty straightforward to understand the mathematical assertion being made here, let’s look at each property in detail. Here, for example, is the display name (“label”) used for the entity representing the above theorem in the entity store, formatted using InputForm to display quotes explicitly and thus emphasize that the label is a string:

Similarly, here are alternate ways of referring to the theorem:

… the universally quantified variables appearing at the top level of the theorem statement (i.e. these are the variables representing the objects that the theorem is “about”):

… the conditions these objects must satisfy in order for the theorem to apply:

… and the conclusion of the theorem:

Of course, we could have just as easily listed Math["IsHausdorff"][Χ] as a restriction to this theorem and Math["IsT1"][Χ] as the statement since the manner in which the hypotheses are split between "Restrictions" and "Statement" is not unique. However, while the details of the splitting are subject to style and readability, the mathematical content of the theorem as expressed through any of these subjective choices is equivalent.

Finally, we can retrieve metadata about the source from which the theorem was curated:

Now, backing up a bit, you may well wonder about expressions with structures such as Category[...] and Math[...] that you’ve seen above. Let’s take a look at one of them, but this time through a general topology concept instead of a theorem:

Some of these properties are shared with corresponding properties for theorems:

You can see the common properties by intersecting the full lists of supported properties for concepts and theorems:

While properties are similar across topology theorems and concepts, there are some differences that should be addressed. "Arguments" for a concept takes the role of "QualifyingObjects" for a theorem. Just as theorems are thought of as applying to certain objects, concepts are thought of as functions that can be applied to certain objects. The output can be a Boolean value, as in this case. We would call such a concept a property or a predicate. Other concepts represent mathematical structures. For example, Math["MetricTopology"] takes a metric space as an argument and outputs the corresponding topology induced by the metric. The entity that corresponds to this math concept is .

A "Restrictions" property for concepts is very similar to the corresponding property for theorems. And just as in the case with theorems, there’s nothing in principle stopping us from moving this condition from "Restrictions" and conjoining it to the output. The difference is that this can always be done for theorems, but it can only be done for concepts representing properties since the output is interpreted as having a truth value:

Finally, the "Output" property here gives the value of the expression Math["IsHausdorff"][X]:

When we use such an expression in a theorem or in the definition of another concept, we interpret it as equivalent to what we see in "Output". As we know, stating and understanding mathematics is much easier when we have such shorthands than if all theorems were stated in terms of atomic symbols and basic axioms.

Two of the most exciting properties on this list are "RelatedConcepts" and "RelatedTheorems". One of our goals is to represent mathematical concepts and theorems in a maximally computable way, and these are just an example of some of the computations we hope to do with these entities. A concept appears in "RelatedConcepts" if it is used in the "Restrictions", "Notation" or "Output" of a concept or the "Restrictions", "Notation" or "Statement" of a theorem. A theorem appears in the "RelatedTheorems" of a concept if that concept appears in the "RelatedConcepts" of that theorem. With this in mind, take a closer look at the examples above:

It is important to emphasize that these relations were not curated, but rather computed, which is possible because of the precise, consistent and expressive language used to encode the concepts and theorems. As a matter of convenience, however, they’ve been precomputed for speed to allow you to, say, easily find the definition of concepts appearing in a theorem.

As an example of the power of this approach, we can use the Wolfram Language’s graph functionality to easily analyze the connectivity and structure of the network of topological theorems and concepts in our corpus:

RGBColor[0.65, 1, 0.65], "GeneralTopologyTheorem" -> RGBColor[1, 1, 0.5] }], #["SummaryGrid"]] & /@ nodes; edges = Join @@ (Flatten[ Thread /@ Normal@EntityValue[#, "RelatedConcepts", "EntityAssociation"]] & /@ domains);" title="domains = {"GeneralTopologyConcept", "GeneralTopologyTheorem"}; nodes = Join @@ (EntityList /@ domains); labelednodes = Tooltip[Style[#, EntityTypeName[#] /. { "GeneralTopologyConcept" -> RGBColor[0.65, 1, 0.65], "GeneralTopologyTheorem" -> RGBColor[1, 1, 0.5] }], #["SummaryGrid"]] & /@ nodes; edges = Join @@ (Flatten[ Thread /@ Normal@EntityValue[#, "RelatedConcepts", "EntityAssociation"]] & /@ domains);" width="620" height="252" class="alignnone size-full wp-image-34372" />

As was the case for topological spaces, a number of extension symbols to the Wolfram Language were introduced in this project. We already encountered the Math and Theorem extensions, but there are also a number of others. For now, they have been placed in a GeneralTopology` context (analogous to the PureMath` context introduced for function spaces). This can be verified by examining the context of such symbols, e.g.:

The motivation behind appending GeneralTopology` to our context path is also now revealed, namely to suppress verbose context formatting in our outputs (so we will see things like Math instead of GeneralTopology`Math). Here is a complete listing of language extensions introduced in the GeneralTopology` context:

Again—as was the case for language extensions introduced for function spaces—some of these may eventually find their way into the Wolfram Language. However, independent of such considerations, these two small projects already show the need for some kind of infrastructure that allows incorporation, sharing and alignment of language extensions from different—and likely independently curated—domains.

We close with some experimental tidbits used to enhance the readability and usability of the concepts and theorems in our entity store. You have probably already noted the nice formatting in "SummaryGrid" and possibly even wondered how it was achieved. The answer is that it was produced using a set of MakeBoxes assignments packaged inside the entity store via the property EntityValue["GeneralTopologyTheorem", "TraditionalFormMakeBoxAssignments"]. Similarly, in order to provide usage messages for the GeneralTopology` symbols (which must be defined prior to having messages associated with them), we have packaged the messages in the special experimental EntityValue["GeneralTopologyTheorem", "Activate"] property, which can be activated as follows:

The result is the instantiation of standard Mathematica-style usage messages such as:

While the eventual implementation details of such features into a standard framework remains the subject of ongoing design and technical discussions, the ease with which it is possible to experiment with such functionality (and to implement semantic representation of mathematical structures in general) is a testament to the power and flexibility of the Wolfram Language as a development and prototyping tool.

Conclusion

These projects undertaken at Wolfram Research during the last year have explored the semantic representation of abstract mathematics. In order to facilitate experimentation with this functionality, we have posted two small notebooks to the cloud (function space entity domain and the topology entity store) that allow interactive exploration and evaluation without the need to install a local copy of Mathematica. We welcome your feedback, comments and even collaboration in these efforts to extend and push the limits of the mathematics that can be represented and computed.

As a final note, we would like to emphasize that significant portions of the work discussed here were carried out as a part of internship projects. If you know or are a motivated mathematics or computer science student who is interested in trying to break new ground in the semantic representation of mathematics, please consider 1) learning the Wolfram Language (which, since you are reading this, you may well have already) and 2) joining the Wolfram internship program next summer!
]>

Protecting NHS Patients with the Wolfram Language

Robert Cook — Fri, 16 Dec 2016 14:36:20 +0000
The UK’s National Health Service (NHS) is in crisis. With a current budget of just over £100 billion, the NHS predicts a £30 billion funding gap by 2020 or 2021 unless there is radical action. A key part of this is addressing how the NHS can predict and prevent harm well in advance and deliver a “digital healthcare transformation” to their frontline services, utilizing vast quantities of data to make informed and insightful decisions.

This is where Wolfram comes in. Our UK-based Technical Services Team worked with the British NHS to help solve a specific problem facing the NHS—one many organizations will recognize: data sitting in siloed databases, with limited analysis algorithms on offer. They wanted to see if it was possible to pull together multiple data sources, combining off-the-shelf clinical databases with the hospital trusts’ bespoke offerings and mine them for signals. We set out to help them answer questions like “Can the number of slips, trips and falls in hospitals be reduced?”

I was assigned by Wolfram to lead the analysis. The databases I was given consisted of about six years’ worth of anonymized data, just over 120 million patient records. It contained a mixture of aggregate averages and patient-level daily observations, drawn from four different databases. While Mathematica is not a database, it has the ability to interface with them easily. I was able to plug into the SQL databases and pull in data from Excel, CSV and text files as needed, allowing us to inspect and streamline the data.

Working closely with a steering committee comprising healthcare professionals, academics and patients, we identified a range of parameters to investigate, including the level of nurse staffing and training, average patient heart rate and the rate of patients suffering from slips and falls. Altogether, the team identified around 1,000 parameter pairings to investigate, far too many to work through by hand in the limited time available.

Some of the tools in the Wolfram Language that made this achievable include:

DatabaseLink and Import

Dataset framework to manipulate the multidimensional data quickly into a usable form

Statistics framework for performing the regression work to identify the interesting trends, and hypothesis testing to ensure their significance

Report generation framework and visualization tools for generating the output and final academic paper

These tools enabled us to rapidly scale up the analysis across this complex dataset, allowing more time to consider the validity of the relationships and signals that emerged. Some of these seemed obvious—wards where patients were more likely to be bed-bound for medical reasons had fewer falls. But not all the signals were this easy to explain. For example, an increase in the number of nurses appeared to be linked to an increase in falls.

This observation seemed surprising. Given that there is little variation in ward size, it seemed unlikely that more nurses would lead to a decrease in patient safety. But not all nurses are equivalent. When we considered the ratio of registered nurses to healthcare support workers, we saw a strong relationship between the increase in highly trained registered nurses and the increase in patient safety.

So we see an increase in falls in some wards that rely more heavily on healthcare support workers. Could these wards be forced to rely on these less qualified, lower-paid nurses when in truth fully licensed, registered nurses are needed? I can only speculate, and the data at this stage is insufficient to answer this question. But following this analysis, the hospital trust in question has changed its staffing policy to increase the level of registered-nurse employment. Whether it leads to an increase in patient safety or a new issue raises its head—we will have to wait and see.

For the full findings, see the paper published this week in BMJ Open.

This project has only started to scrape the surface of the complexities hidden inside this rich dataset. In a mere 10 days, relying on the flexibility designed into the Wolfram Language, we’re able to deliver some insight into this complex problem.

Contact the Wolfram Technical Services group to discuss your data science or coding projects.
]>

Launching Wolfram|Alpha Open Code

Stephen Wolfram — Mon, 12 Dec 2016 18:55:41 +0000

Code for Everyone

Computational thinking needs to be an integral part of modern education—and today I’m excited to be able to launch another contribution to this goal: Wolfram|Alpha Open Code.

Every day, millions of students around the world use Wolfram|Alpha to compute answers. With Wolfram|Alpha Open Code they’ll now not just be able to get answers, but also be able to get code that lets them explore further and immediately apply computational thinking.

It takes a lot of sophisticated technology to make this possible. But to the user, it’s simple. Do a computation with Wolfram|Alpha. Now in almost every section of the output you’ll see an “Open Code” link. Click it and Wolfram|Alpha will generate code for you, then open it in a fully runnable and editable notebook that you can immediately use in the Wolfram Open Cloud:

The sections of the notebook parallel the sections of your Wolfram|Alpha output. But now each section contains not results, but instead core Wolfram Language code needed to get those results. You can run any piece of code by clicking the [>] button (or typing Shift+Enter):

But the really important thing is that right there on the web you can change and extend the code, and then instantly run it again:

The Power of Code

If all someone wants is a single, quick result, then classic Wolfram|Alpha should be all they’ll need. But as soon as they want to go further—that’s where Wolfram|Alpha Open Code comes in.

Let’s say you just got a mathematical result from Wolfram|Alpha:

But then you wonder: “what happens for a whole range of exponents?” Well, it’s going to get pretty complicated to tell Wolfram|Alpha what you want just using natural language. But it’s easy to say what to do by giving a tiny bit of Wolfram Language code (and, yes, you can interactively spin those 3D surfaces around):

You could give code to interactively change the parameters too:

Starting with Wolfram|Alpha, then extending using the Wolfram Language, is very powerful. Here’s what happens with some real-world data. Start in Wolfram|Alpha, then get the underlying Wolfram Language code (it can be made shorter, but then it’s a little less clear what’s going on):

Evaluate the code to get a time series. Then plot it. And divide by the corresponding result for the US:

All, "CurrencyUnit" -> "CurrentUSDollar"}]]/Entity["Country", "UnitedStates"][EntityProperty["Country", "GDP", {"Date" -> All, "CurrencyUnit" -> "CurrentUSDollar"}]],Filling->Axis]" title="DateListPlot[%] DateListPlot[Entity["Country", "Italy"][EntityProperty["Country", "GDP", {"Date" -> All, "CurrencyUnit" -> "CurrentUSDollar"}]]/Entity["Country", "UnitedStates"][EntityProperty["Country", "GDP", {"Date" -> All, "CurrencyUnit" -> "CurrentUSDollar"}]],Filling->Axis]" width="615" height="532" class="alignnone size-full wp-image-34082" />

An important feature of notebooks is that they’re full, computable documents—and you can add whatever you want to them. You can do a whole series of computations. You can put in text to annotate what you’re doing. You can add section headings. You can edit out parts you don’t need. And so on. And of course you can do all of this in the cloud, using any modern web browser.

The Ulterior Motive

Wolfram|Alpha Open Code is going to be really useful to a lot of people—not just students. But when I invented it my immediate objective was very much educational: I wanted to be able to give the millions of students who use Wolfram|Alpha every day a taste of the power of code, and what can be achieved if one learns about code and computational thinking.

Computational thinking is a critically important skill for the future. And after 30 years of development we’re at the exciting point with the Wolfram Language of being able to directly teach serious computational thinking to a very wide range of students. I see Wolfram|Alpha Open Code as opening a window into the world of computational thinking for all the students who use Wolfram|Alpha.

There’s no learning curve to climb with Wolfram|Alpha: you just type in your question, directly in natural language. But now with Wolfram|Alpha Open Code you can explicitly see how your question gets interpreted computationally. And as soon as you want to go further, you’re immediately doing computational thinking, and writing code. You’re not doing an abstract coding exercise, or creating code in some toy context. You’re immediately using code to formulate computational ideas and get results about something you’re working on.

Of course, what makes this feasible is the character of the Wolfram Language—and its uniquely high-level knowledge-based nature. Because that’s what allows real computations that you want to do to be expressed in small amounts of code that can readily be understood and modified or extended.

Yes, the Wolfram Language has a definite structure and syntax, based on definite principles. But that’s a lot of what makes it easy to understand and to write. And in a notebook you’re always getting suggestions about what to type—and if your browser language is set to something other than English you’ll often get annotations in that language too. And the code you get from using Wolfram|Alpha Open Code will continually illustrate the core principles of the Wolfram Language.

Paths into Computational Thinking

Over the course of the past year, we’ve introduced two important paths into computational thinking, both supported by Wolfram Programming Lab, and available free in the Wolfram Open Cloud.

The first path is to start from Explorations: small projects created using code, that a student can immediately dive into, and then modify and interact with. The second path is to systematically learn the Wolfram Language, for example using my book An Elementary Introduction to the Wolfram Language.

And now Wolfram|Alpha Open Code provides a third path: start from a question that a student has asked, and then automatically generate custom code that provides a starting point for further work and thinking.

It’s a nice complement to the other two paths—and perhaps it’ll often provide encouragement to pursue one or the other of them. But it’s a perfectly good path all by itself—and students can go a long way following it.

Of course, under the hood, there’s a remarkable amount of sophisticated technology that’s being used. There’s the whole natural-language understanding system of Wolfram|Alpha that’s understanding the original question. There’s the Wolfram|Alpha computational knowledge system that’s formulating what pieces of code to generate. Then there’s the Wolfram Open Cloud, providing an interactive notebook environment on the web capable of running the code. And at the center of all of it is the Wolfram Language, with its whole integrated design and vast built-in capabilities and knowledge.

It’s taken 30 years of development to get to this point. But now we’ve been able to put everything together to create a very powerful path for students to get into computational thinking.

And I have to say that for me it’s exciting to think about kids out there using Wolfram|Alpha just for homework, but then pressing the Open Code button, and suddenly being transported into the world of code and computational thinking—and perhaps starting on a lifelong journey.

I’m thrilled to be able to provide the tools that make this possible. Try it out. Tell us what you think. Share what you do, and show others what’s possible.

]>

Edit Your NaNoWriMo Novel with the Wolfram Language

Zach Littrell — Fri, 09 Dec 2016 17:38:33 +0000
If you’re like many of us at Wolfram, you probably know that November was National Novel Writing Month (NaNoWriMo). Maybe you even spent the past few weeks feverishly writing, pounding out that coming-of-age story about a lonely space dragon that you’ve been talking about for years.

Congratulations! Now what? Revisions, of course! And we, the kindly Wolfram Blog Team, are here to get you through your revisions with a little help from the Wolfram Language.

By combining the Wolfram Language’s text analysis tools with the Wolfram Knowledgebase’s collection of public-domain novels by authors like Jane Austen and James Joyce, we’ve come up with a few things to help you reflect on your work and see how you measure up to some of the greats.

Literary scholars have been using computational thinking to explore things like genre and emotion for years. Working with large amounts of data, this research gives us a larger picture of things that we can’t discover through reading individual novels, illuminating patterns within the mass of published novels that we would otherwise know little about.

“That’s all well and good,” you might say, “but what about my great (and scandalously unread) dragon bildungsroman?” Well, you’re in luck! You can apply the principles of computational thinking to your writing as well by using the Wolfram Language to help you revise.

Revealing Your Writing Tics

Many writers have things about their writing that they would like to improve. It might be a tendency to overuse adjectives or a penchant for bird metaphors. If you already know your writing tics, it’s easy to find them using your word processor. But what if you don’t know what you’re looking for?

A great way to find unknown writing tics is to use WordCloud, which can help you visualize words’ frequencies and relative importance. We can test this method on Herman Melville’s classic Moby-Dick.

We start by pulling up a list of Melville’s notable books.

Then use WordCloud to create a visualization of word frequency in one of his novels—say, Moby-Dick.

Of the words that appear most often, the titular Moby Dick is a whale, and the narrator reflects frequently on the obsessed ship captain Ahab. But notice something interesting: the word “like” shows up disproportionately—even more than key words such as “ship,” “sea” and “man.” And from inspecting places where he uses the word “like,” we can discover that Melville loves similes:

“like silent sentinels all around the town”
“like leaves upon this shepherd’s head”
“like a grasshopper in a May meadow”
“like a snow hill in the air”
“like a candle moving about in a tomb”
“like a Czar in an ice palace made of frozen sighs”

The similes help bring cosmic grandeur to his epic about a whaling expedition—but it also shows that even the greats aren’t immune to over-reliance on literary devices. As explained in the classic book on writing, The Elements of Style by Strunk and White, similes are handy, but should be used in moderation: “Readers need time to catch their breath; they can’t be expected to compare everything with something else, and no relief in sight.”

Our coworker Kathryn Cramer, a science fiction editor and author with some serious chops, often uses word clouds as an editing tool. She looks at her most frequently used words and asks whether there are any double meanings (what she calls “sinister puns”) that she can develop. She notes that you can also use them to clean up sloppy writing; if she sees too many instances of “then,” she knows that there are too many sentences that use “and then.”

An easy way to find if your word has some double meanings, synonyms, antonyms or any other interesting property is to use the WordData function and see how many different ways you can play with a word like “hand,” for instance.

You can try these techniques out on your own writing using Import. Maybe you could even identify some more writing tics in some of the famous authors whose works are built into the Wolfram Language!

Whom Do You Write Like?

When polishing our prose, many of us often think, “I wish I could write like [insert famous author here].”

For some added fun, in just a few lines of code, we can take a selection of authors—Virginia Woolf, Herman Melville, Frederick Douglass, etc.—and build a ClassifierFunction from their notable books.

Then with a simple FormPage, we built in about half an hour a fun, toy web app called AuthorIdentify that tries to figure out which classic author would most likely have written your text sample.

To test it out, we gave AuthorIdentify the first paragraph of James Joyce’s Finnegans Wake, which was not already in the system. To our delight, it correctly identified the author of the work to be Joyce.

But it’s more fun to let it take a stab at your own work. Our coworker Jeremy Sykes, a publishing assistant here at Wolfram, shared with us a paragraph of his novel, an intergalactic space thriller that combines sci-fi, economics and comedy: Norman Aidlebee, Galactic CPA.

It’s fun trying different samples and playing with them to see if you can get a different author. While far from perfect, our AuthorIdentify is still amusing, even when it’s incredibly wrong.

Feel free to try it out. With some more work, including more text samples, a wider range of authors and playing with the options in Classify, it is simple to build a robust author identification app with the Wolfram Language.

We hope some of these tips and tools help you aspiring writers out there as you sit down and edit your manuscript. We’ve found the Wolfram Language to be an excellent revision buddy, and are constantly on the lookout for new ways to enhance the editing process. So if you have any ideas of your own on how to code with your novel, please feel free to share in the comments!
]>