Pondering New Ways of Programming Lego EV3 Mindstorms Bricks

We’re due to update our first level residential school day-long robotics activity for next year, moving away from the trusty yellow RCX Lego Mindstorms bricks that have served us well for getting on a decade or so, and up to the EV3 Mindstorms bricks.

Students programmed the old kit via a brilliant user interface developed by my colleague Jon Rosewell, originally for the “Robotics and the Meaning of Life” short course, but quickly adopted for residential school, day school, and school-school (sic) outreach activities.

[Screenshot: the drag-and-drop programming interface – command palette on the left, program canvas in the centre, robot simulator panel on the right]

The left hand side contained a palette of textual commands that could be dragged onto the central canvas in order to create a tree-like programme. Commands could be dragged and relocated within the tree. For commands that took variable values, the value was set by selecting the appropriate line of code and then using the dialogue in the bottom left corner.

The programme could be downloaded and executed on the RCX brick, or used to control the behaviour of the simple simulated robot in the right hand panel. A brilliant, brilliant piece of educational software. (A key aspect of this is how easy it is to support in a lab or classroom with a dozen student groups who need help debugging their programmes. The UI is clear and easily seen over the shoulder, and buggy code can typically be fixed with a word or two of explanation. The text-but-not-text metaphor – it’s really a drag and drop UI, but with text blocks rather than graphical blocks – reduces typos as well as producing pretty readable code.)

For the new residential school, we’ve been trying to identify what makes sense software-wise. The default Lego software is based on LabVIEW; it looks a bit toylike (which isn’t necessarily a problem), but IMHO it could be hard to help debug in a residential school setting, which probably is an issue. “Real” LabVIEW can also be used to program the bricks (I think), but again the complexity of the software, and similar issues in quick-fire debugging, are potential blockers. Various third party alternatives to the Lego software are also possible: LeJOS, a version of Java that’s been running on Mindstorms bricks for what feels like forever, is one possibility; another is ev3dev, a Linux distribution for the brick that lets you run things like Python, along with the python-ev3 Python package. You can also run an IPython notebook from the brick – that is, the IPython notebook server runs on the brick and you then access the notebook via a browser running on a machine with a network connection to the brick…

So, as needs must (?!;-), I spent a few hours today setting up an EV3 with ev3dev, python-ev3 and an IPython notebook server. Following the provided instructions, everything seemed to work okay over a USB connection to my Mac, including getting the notebooks to auto-run on boot, but I couldn’t get an ssh or http connection over bluetooth. I didn’t have a nano wifi dongle either, so I couldn’t try a wifi connection.

The notebooks seem rather slow when running code cells, although responsiveness when I connected to the brick via an ssh terminal from my Mac seemed pretty good, at least for running command line commands. Code popped into an executable, shebanged Python file can be run from the brick itself simply by selecting the file in the on-board file browser, so immediately a couple of possible workflows suggest themselves:

  • programme the brick via an IPython notebook running on the brick, executing code a cell at a time to help debug it;
  • write the code somewhere, pop it into a text file, copy it onto the brick and then run it from the brick.

It should also be possible to export the code from a notebook into an executable file that could be run from the on-brick file browser.
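
For what it’s worth, here’s a minimal sketch of the sort of standalone file I have in mind. The shebang-plus-chmod +x pattern is the key part; the sysfs LED path is an assumption (check what actually appears under /sys/class/leds/ on your own brick):

#!/usr/bin/env python
# Minimal standalone script, runnable from the on-board file browser
# once it has been made executable (chmod +x blink.py).
# ev3dev exposes devices via sysfs; the LED path below is an
# assumption - check /sys/class/leds/ on the brick itself.
import time

LED = '/sys/class/leds/ev3:green:left/brightness'

with open(LED, 'w') as f:
    f.write('0')    # left green LED off
time.sleep(1)
with open(LED, 'w') as f:
    f.write('255')  # and back on full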

Another option might be to run IPython on the brick, accessed from an ssh terminal, to support interactive development a line at a time:

[Screenshot: an interactive IPython session running on the EV3 brick, accessed over ssh]

This seems to be pretty quick/responsive, and offers features such as autocomplete prompts, though perhaps not as elegantly as the IPython notebooks manage.

However, the residential school activities require students to write complete programmes, so the REPL model of the interactive IPython interpreter is perhaps not the best environment?

Thinking more imaginatively about the setting, if we had wifi working, and with a notebook server running on the brick, I could imagine programming and interacting with the brick from an IPython notebook accessed via a browser on a tablet (assuming it’s easy enough to get network connections working over wifi). This could be really attractive for opening up how we manage the room for the activity, because it would mean we could get away from the computer lab/workstation model for each group and have a far more relaxed lab setting. The current model has two elbow-height demonstration tables, about 6′ x 3′6″, around which students gather for mildly competitive “show and tell” sessions, so having tablets rather than workstations for the programming could encourage working directly around the tables as well?

That the tablet model might be interesting to explore originally came to me when I stumbled across the Microsoft Touch Develop environment, which provides a simple programming environment with a keyboard reminiscent of the ZX Spectrum’s, where single keys insert complete text commands.

[Screenshot: a script open in the Microsoft Touch Develop editor]

Sigh… those were the days…:

[Image: a ZX Spectrum keyboard]

Unfortunately, there doesn’t seem to be an EV3 language pack for Touch Develop :-(

However, there does appear to be some activity around developing a Python editor for use in Touch Develop, albeit just a straightforward text editor.

This seems to have been developed for use with the BBC micro:bit, which will be running MicroPython, a version of Python 3 purpose-built for microcontrollers (/via The Story of MicroPython on the BBC micro:bit).

It’s maybe worth noting that TouchDevelop is accessed via a browser and can be run in the cloud or locally (touchdevelop local).

We’re currently also looking for a simple Python programming environment for a new level 1 course, and I wonder if something of this ilk might be appropriate for that…?

Finally, another robot related ecosystem that crossed my path this week, this time via @Downes – the Poppy Project, which proudly declares itself as “an open-source platform for the creation, use and sharing of interactive 3D printed robots”. Programming is via pypot, a Python library that also works with the (also new to me) V-REP virtual robot experimentation platform, a commercial environment, though it does seem to have a free educational license. (The Poppy Project folk also seem keen on IPython notebooks, auto-running them from the Raspberry Pi boards used to control the Poppy robots, not least as a way of sharing tutorials.)

I half-wondered if this might be relevant for yet another new course, this time at level 2, on electronics – though it will also include some robotics elements, including access (hopefully) to real robots via a remote lab. These will be offered as part of the OU’s OpenSTEM Lab, which I think will complement the current, and already impressive, OpenScience Lab with remotely accessed engineering experiments and demonstrations.

Let’s just hope we can get a virtual computing lab opened too!

Tinkering With FutureLearn Data – Rebasing Time

As a member of an organisation where academics tend to be course designers and course producers, and are kept as far away from students as possible (Associate Lecturers handle delivery as personal tutors and personal points of contact), I’ve never really got my head around what “learning analytics” is supposed to deliver: it always seemed far more useful to me to think about course analytics as a way of tracking how the course materials are working, and whether they seem to be being used as intended. Rather than being interested in particular students, the emphasis would be more on how a set of online course materials works, in much the same way as you might track how any website works. Which is to say: are folk going to the pages you expect, spending the time on them you expect, reaching goal pages as and when you expect, and so on.

Having just helped out on a FutureLearn course, I was allowed to have a copy of the course-related data files FutureLearn makes available to partners:

  • enrolments: hashed learner id, datetime of enrolment onto the course, when they unenrolled (if they did), when they “fully participated”, what their role was (learner, educator, etc);
  • step activity: datetime of first visit to, and last completion of, a step, along with step and hashed learner details;
  • comments: datetime of comment, learner id, comment id, whether it was in response to a comment id, step;
  • question response: datetime of a question response, step, question number, answer option, correct/incorrect flag, hashed learner id.

The course was on learning to code for data analysis using the Python pandas library, so I thought I’d try to apply what was covered in the course (perhaps with a couple of extra tricks…) to the data that flowed from the course…

And here’s one of the tricks… rebasing (normalising) time.

For example, one of the things I was interested in was how long learners were spending on particular steps and particular weeks on the one hand, and how long their typical study sessions were on the other. This could then all be aggregated to provide some course stats about loading which could feed back into possible revisions of the course material, activity design (and redesign) etc.

Here’s an example of how a randomly picked learner progressed through the course:

[Chart: one learner’s first visits to each step over time, coloured by study stint]

The horizontal x-axis is datetime; the vertical y-axis is an encoding of the week and step number, with clear separation between the weeks, and steps within a week incrementally ordered. The points show the datetime at which the learner first visited each step. The points are coloured by “stint”, a trick I borrowed from my F1 data wrangling stuff: during the course of a race, cars complete several “stints”, where a stint corresponds to a set of laps completed on a particular set of tyres; analysing races based on stints can often turn up interesting stories…

To identify separate study sessions (“stints”) I used a simple heuristic – if the gap between the start times of consecutively studied steps exceeded a certain threshold (55 minutes, say), then I assumed the steps were studied in separate sessions. This needs a bit of tweaking, possibly: perhaps including timestamps from comments or question responses that fall within long gaps, to flag them as not being breaks in study; or perhaps deciding whether the gap between two steps is actually a long one by comparing it with a typically short median time for that step. (There are similar issues in the F1 data, for example when trying to work out whether a pit stop may actually be a drive-through penalty rather than an actual stop.)
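
In pandas terms, the heuristic boils down to a groupby/diff/cumsum pattern. Here’s a minimal sketch, using illustrative column names rather than the actual FutureLearn export schema:

import pandas as pd

# Toy stand-in for the step activity data; column names are illustrative
df = pd.DataFrame({'learner_id': ['a', 'a', 'a', 'a'],
                   'first_visited_at': pd.to_datetime(
                       ['2015-10-01 10:00', '2015-10-01 10:20',
                        '2015-10-01 12:30', '2015-10-01 12:40'])})

THRESHOLD = pd.Timedelta(minutes=55)

df = df.sort_values(['learner_id', 'first_visited_at'])

# Gap between consecutive step visits, per learner
gap = df.groupby('learner_id')['first_visited_at'].diff()

# A gap longer than the threshold marks the start of a new stint;
# a running count of those flags numbers the stints for each learner
df['stint'] = (gap > THRESHOLD).astype(int).groupby(df['learner_id']).cumsum()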

In the next example, I rebased the time for two learners against the time each first encountered the first step of the course. That is, the “learner time” (in hours) for a step is the time between a learner first seeing that step and the time they first saw their first step. The colour field distinguishes between the two learners.

[Chart: step visits for two learners, rebased to the time each learner first saw the first step of the course]
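
The rebasing itself is a one-liner once you have a per-learner origin – again, a sketch against the illustrative column names used above:

# Rebase each learner's visit times against their own first visit,
# giving "learner time" in hours
origin = df.groupby('learner_id')['first_visited_at'].transform('min')
df['learner_hours'] = (df['first_visited_at'] - origin).dt.total_seconds() / 3600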

We can draw on the idea of “stints”, or learner sessions, further, and use the earliest time within a stint as the origin. So for example, for another random learner, here we see an incremental encoding of the step number on the y-axis, with the weeks clearly separated, the “elapsed study session time” along the horizontal x-axis, and the colour mapping out the different study sessions.

[Chart: one learner’s steps against elapsed study session time, coloured by study session]
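
The same transform trick applies, just grouped at the (learner, stint) level:

# Rebase within each study session so every stint starts at time zero
stint_origin = df.groupby(['learner_id', 'stint'])['first_visited_at'].transform('min')
df['stint_minutes'] = (df['first_visited_at'] - stint_origin).dt.total_seconds() / 60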

The spacing on the y-axis needs sorting out a bit more so that it shows progression through the steps more clearly, perhaps by using an ordered categorical axis with a faint horizontal rule separator to distinguish the separate weeks. (Having an interactive pop-up containing some information about the particular step each mark refers to, as well as how much time was spent on it, whether there was commenting activity, what the mean and median study times for the step are, and so on, could also be useful.) However, I have to admit that I find charting in pandas/matplotlib really tricky, and only seem to have slightly more success with seaborn; I think I may need to move this stuff over to R so that I can make use of ggplot, which I find far more intuitive…

Finally, whilst the above charts are at the individual learner level, my motivation for creating them was to better understand how the course materials were working, and to try to get my eye in to some numbers that I could start to track as aggregates (means, medians, etc) over the course as a whole. (Trying to find ways of representing learner activity so that we can start to identify clusters or common patterns of activity – signatures of different ways of studying the course – is a whole other problem, though visual approaches may also prove helpful there.)

Some Jupyter Notebook / nbconvert Housekeeping Hints

A few snippets and bits and pieces regarding housekeeping around Jupyter notebooks.

Clearing Output Cells

Via Matthias Bussonnier, this handy command renders a version of a notebook with the output cells cleared:

jupyter nbconvert --to notebook --ClearOutputPreprocessor.enabled=True NOTEBOOK.ipynb

Adding --inplace will rewrite the notebook with cleared output cells.
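
So, for example, to strip the outputs from a notebook, overwriting the original file:

jupyter nbconvert --to notebook --ClearOutputPreprocessor.enabled=True --inplace NOTEBOOK.ipynb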

Custom Templates

If you have a custom template or a custom config file in the current directory, you can invoke them using:

jupyter nbconvert --config=my_config.py NOTEBOOK.ipynb
jupyter nbconvert --template=my_template.tpl NOTEBOOK.ipynb

I found that running with the --log-level=DEBUG flag was also handy…

Via MinRk, additional paths can be set using c.TemplateExporter.template_path.append('/path/to/templates'), though I’m not really sure where that setting needs to be applied. (Whilst I love the Jupyter project, I really struggle to keep track of where things are supposed to be located and which bits are working/don’t work anymore :-( )
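
One place the setting does seem to take effect is in a config file passed in via the --config flag shown above – a sketch, with an assumed template directory path:

# my_config.py - use with: jupyter nbconvert --config=my_config.py NOTEBOOK.ipynb
c = get_config()
# Add an extra directory for nbconvert to search for templates
# (the path is a placeholder - point it at your own templates)
c.TemplateExporter.template_path.append('/path/to/templates')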

He also notes that [a]bsolute template paths will also work if you specify: c.TemplateExporter.template_path.append('/'), adding the comment that [a]bsolute paths for templates should probably work without modifying template_path, but they don’t right now.

It would be really handy if the ability to specify an absolute path in the command line setting did work out of the box…

Split a Long Notebook into Multiple Sub-Notebooks

The Jupyter notebook allows you to split a long cell into two cells at the cursor point, but how about splitting a long notebook into multiple notebooks?

The following test script will split a notebook into sub-notebooks at an explicit split point – a markdown cell containing just the string SPLIT NOTEBOOK.

import IPython.nbformat as nb
import IPython.nbformat.v4.nbbase as nb4

# Partition a long notebook into sub-notebooks at specified split points.
# Enter SPLIT NOTEBOOK on its own in a markdown cell to specify a split point.
mynb = nb.read('TEST_LONG_NOTEBOOK.ipynb', nb.NO_CONVERT)

c = 1
test = nb4.new_notebook()
for i in mynb['cells']:
    if i['cell_type'] == 'markdown':
        if 'SPLIT NOTEBOOK' in i['source']:
            # Write out the cells collected so far and start a new
            # sub-notebook; the marker cell itself is dropped
            nb.write(test, 'subNotebook{}.ipynb'.format(c))
            c = c + 1
            test = nb4.new_notebook()
        else:
            test.cells.append(nb4.new_markdown_cell(i['source']))
    elif i['cell_type'] == 'code':
        # Copy the code cell source along with any outputs it contains
        cc = nb4.new_code_cell(i['source'])
        for o in i['outputs']:
            cc['outputs'].append(o)
        test.cells.append(cc)

# Write out whatever remains after the last split point
nb.write(test, 'subNotebook{}.ipynb'.format(c))

I should probably tidy this up so that it reuses the original notebook name rather than the subNotebook stub. It might also be handy to turn this into a notebook extension that splits the current notebook into two relative to the current cursor location (e.g. all cells above the selected cell go into one sub-notebook, and everything from the selected cell to the end of the notebook goes into a second sub-notebook).

Another refinement might allow for the declaration of a common set of cells to prefix each sub-notebook. By default, this could be the set of cells prior to the first split point. (Which is to say, for N split points there would be N, rather than N+1, sub-notebooks, with the cells above the first split point appearing in each sub-notebook. The first sub-notebook would thus contain the cells starting with the first cell after the first split point, prefixed by the cells appearing before the first split point; and the last sub-notebook would contain the cells starting with the first cell after the last split point, again prefixed by the cells appearing before the first split point.)

A second utility to merge, or concatenate, two or more notebooks when provided with their filenames might also be handy…
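
As a quick sketch along the same lines, reusing the same IPython.nbformat calls as the splitter above, something like the following should do the basic concatenation:

import IPython.nbformat as nb
import IPython.nbformat.v4.nbbase as nb4

def merge_notebooks(filenames, outfile='merged.ipynb'):
    # Append the cells of each notebook, in the order the files
    # are given, into a single new notebook
    merged = nb4.new_notebook()
    for fn in filenames:
        merged.cells.extend(nb.read(fn, nb.NO_CONVERT)['cells'])
    nb.write(merged, outfile)

merge_notebooks(['subNotebook1.ipynb', 'subNotebook2.ipynb'])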

Anything Else?

So what other handy Jupyter notebook / nbconvert housekeeping hints and tricks am I missing?

Some Random Upstart Debugging Notes…

…just so I don’t lose them…

dmesg spews out messages the Linux kernel has been issuing, which should also appear in /var/log/syslog (h/t Rod Norfor).

/var/log/upstart/SERVICE.log has log messages from trying to start a service SERVICE.

/etc/init.d should contain what looks like a generic sort of file with filename SERVICE; the actual config file containing the command used to start the service lives in a file SERVICE.conf in /etc/init.
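
For reference, a minimal sketch of what such a SERVICE.conf upstart job might look like – the service name and command are made up for the sake of example:

# /etc/init/myservice.conf - illustrative upstart job
description "My example service"

# Start on the normal multi-user runlevels, stop on shutdown/reboot
start on runlevel [2345]
stop on runlevel [016]

# Restart the process if it dies
respawn

exec /usr/local/bin/myservice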

To generate the files that will have a go at auto-running the service, run the command update-rc.d SERVICE defaults.

Start a service with service SERVICE start, stop it with service SERVICE stop, and restart (stop if started, then start) with service SERVICE restart. Find out what’s going on with it using service SERVICE status.

Maybe…

Are Robots Threatening Jobs or Are We Taking Them Ourselves Through Self-Service Automation?

Via several tweets today, a story in the Guardian declaring Robots threaten 15m UK jobs, says Bank of England’s chief economist:

The Bank of England has warned that up to 15m jobs in Britain are at risk of being lost to an age of robots where increasingly sophisticated machines do work that was previously the preserve of humans.

The original source appears to be a speech (“Labour’s Share”) given by Andrew G Haldane, Chief Economist of the Bank of England, to the Trades Union Congress, London, 12 November 2015, and it has bits and pieces in common with recent reports such as this one on The Future of Employment: how susceptible are jobs to computerisation?, or this one asking Are Robots Taking Our Jobs, or Making Them?, or this on The new hire: How a new generation of robots is transforming manufacturing, or this collection of soundbites collected by Pew, or this report from a robotics advocacy group on the Positive Impact of Industrial Robots on Employment. (Lots of consultancies and industry lobby groups seem to have been on the robot report bandwagon lately…) There has also been a recent report from Bank of America/Merrill Lynch on Creative Disruption that seems to have generated some buzz, and which also picks up on several trends in robotics.

But I wonder – is it robots replacing jobs through automating them away, or robots replacing jobs by transferring work from the provider of a service or good directly onto the consumer, turning customers into unpaid employees? That is, what proportion of these robots are actually self-service technologies (SSTs)? So for example, have you ever:

  • used a self-service checkout in a supermarket rather than waiting in line for a cashier to scan your basketload of goods, let alone bought a bag of crisps or bottle of water from a (self-service) vending machine?
  • used a self-service banking express machine or kiosk to pay in a cheque, let alone used an ATM to take cash out?
  • used a self-service library kiosk to scan out a library book?
  • used a self-service check-in kiosk or self-service luggage drop off in an airport?
  • used a self-service ticket machine to buy a train ticket?
  • collected goods from a (self-service) Amazon locker?
  • commented in a “social learning” course to support a fellow learner?
  • etc etc

Who’s taken the work there? If you scan it yourself, you’re an unpaid employee…

Launch Docker Container Compositions via Tutum and Stackfiles.io – But What About Container Stashing?

Via a couple of tweets, it seems that 1-click launching of runnable docker container compositions to the cloud is almost possible with Tutum – deploy to Tutum button [h/t @borja_burgos] – with collections of ready-to-go compositions (or in Tutum parlance, stacks) available on stackfiles.io [h/t @tutumcloud].

The deploy to Tutum button is very much like the binder setup, with URLs taking the form:

https://dashboard.tutum.co/stack/deploy/?repo=REPO_URL

Tutum will look in the repository – a github repository, for example – for tutum.yml, docker-compose.yml and fig.yml files (in that order) and pre-configure a Tutum stack dialogue with the information described in the file.
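
By way of illustration, a minimal stack file might look something like this (the image is Tutum’s own demo container; the service name is just an example):

# tutum.yml - a minimal, illustrative stack file
web:
  image: 'tutum/hello-world'
  ports:
    - '80:80'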

The stack can then be deployed to one or more already running nodes.

The stackfiles.io site hosts a range of pre-defined configuration files that can be used with the deploy button, so in certain respects it acts in much the same way as the panamax directory (Panamax marketplace?).

One of the other things I learned about Tutum is that they have a container defined that can cope with load balancing: if you launch multiple container instances of the same docker image, you can load balance across them (tutum: load balancing a web service). At least one of the configurations on stackfiles.io (Load balancing a Web Service) seems to show how to script this.
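
From a quick skim of that example, the pattern seems to be a tutum/haproxy service linked to the service being balanced – something like the following sketch (the service names, and my reading of the target_num_containers key, are assumptions; check the stackfiles.io entry for the detail):

# Load balancer, routing traffic across the linked web containers
lb:
  image: 'tutum/haproxy'
  links:
    - web
  ports:
    - '80:80'
# The service being balanced; target_num_containers asks Tutum to
# run two instances of it (as I understand the stackfile format)
web:
  image: 'tutum/hello-world'
  target_num_containers: 2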

One of the downsides of the load balancing, and indeed of the deploy to Tutum recipe generally, is that there doesn’t seem to be a way to ensure that server nodes on which to run the containers are available: presumably, you have to start these yourself?

What would be nice would be the ability to also specify an autoscaling rule that could be used to fire up at least one node on which to run a deployed stack. Autoscaling rules would also let you power up/power down server nodes depending on load, which could presumably keep the cost of running servers down to the minimum needed to actually service whatever load is being experienced. (I’m thinking of occasional, and relatively low usage, models, which are perhaps slightly different from a normal web scaling model – for example, the ability to fire up a configuration of several instances of OpenRefine for a workshop, and have autoscaling cope with deploying additional containers (and, if required, additional server nodes) depending on how many people turn up and want to participate.)

There seems to be a discussion thread about autoscaling on the Tutum site, but I’m not sure there is actually a corresponding service offering? (Via @tutumcloud, there is a webhook triggered version: Tutum triggers.)

One final thing that niggles at me particularly in respect of personal application hosting is the ability to “stash” a copy of a container somewhere so that it can be reused later, rather than destroying containers after each use. (Sandstorm.io appears to have this sorted…) A major reason for doing this would be to preserve user files. I guess one way round it is to use a linked data container and then keep the server node containing that linked data container alive, in between rounds of destroying and starting up new application containers (application containers that link to the data container to store user files). The downside of this is that you need to keep a server alive – and keep paying for it.

What would be really handy would be the ability to “stash” a container in some cheap storage somewhere, and then retrieve that container each time someone wanted to run their application (this could be a linked data container, or it could be the application container itself, with files preserved locally inside it?) (Related: some of my earlier notes on how to share docker data containers.)
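
In stock docker terms, the sort of thing I have in mind might look like this (the container and image names are made up):

# Stash: snapshot the container's filesystem as an image, then
# serialise the image to a tarball that can sit in cheap storage
docker commit myapp myapp-stash
docker save myapp-stash | gzip > myapp-stash.tar.gz

# Retrieve: load the image back on some (possibly different) host
# and start a fresh container from it
gunzip -c myapp-stash.tar.gz | docker load
docker run -d --name myapp myapp-stash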

I’m not sure whether there are any string’n’glue models that might support this? (If you know of any, particularly if they work in a Tutum context, please let me know via the comments…)