Tuesday, 24 May 2016

Class 5 is now available

The lessons for Class 5, the last in our course, are now available on the course website:


Class 5 is about scripting. People who use Weka a lot always want to be able to write scripts to do the work, instead of all this clicking. Well, you can! You can

  • write Weka scripts in Python from within the Explorer interface
  • write Weka scripts in Groovy, a Java-based scripting language, in the same way
  • set up the Python Weka wrapper so that you can access the Weka code from within your own Python installation. 
The third option requires you to have your own Python installation, which is simple on Linux but not so simple on other operating systems. That part is up to you -- we don't show you how to do it. But the other two options are immediately available from within Weka once you install the corresponding packages. It's easy!

The post-course assessment is also now open. The videos, slides and transcripts will remain available on YouTube and on the "Materials" site:

There is also a post-course survey for your opinions of the MOOC.

We will run all three of these courses, “Data Mining with Weka,” “More Data Mining with Weka,” and “Advanced Data Mining with Weka,” again, but are not yet sure when.

cheers, and enjoy the remainder of the course!



Monday, 16 May 2016

Class 4 is now available

The six lessons for Class 4 are now available on the course website:

In this class we'll learn about distributed processing with Apache Spark (and also with Hadoop). We're assuming that you don't necessarily have access to a cluster computer, but you can still use the framework on a single machine, and the Activities will show you how to get started with this.

The Application lesson 4.6 shows you how to use Weka for image processing, by creating all sorts of different feature sets for your images using the imageFilter package.

Next week is the last. Pretty soon you will be a certified advanced expert in data mining and the use of Weka! Keep at it!


Monday, 9 May 2016

Class 3 is now available

The six lessons for Class 3 are now available on the course website:


After this week there are 2 weeks to go (classes 4 and 5).

The mid-course assessment is also now available. Do it when you have finished Class 2 (although it will remain open for the rest of the course). The final assessment will appear during week 5.

Check your Profile to ensure that your assessment marks have been recorded correctly. Also, check that the name in your Profile is the one you want on your Statement of Completion, as we will use that exact text for the Statements.

Our goal is to enable you to learn as much as possible from this course, and we recognize that doing the assessments may not be a priority for you. However, our ability to mount follow-up MOOCs will depend on the success of this one as perceived by my University -- and the number of people who complete it successfully will be a key metric. Thus I urge you to do the assessments for my sake, if not your own :-)

In this class we'll learn about interfacing to other data mining packages. The first lesson shows you how to access LibSVM and LibLINEAR, and the remaining ones show you how to access some of the many facilities in the popular R statistical computing package. These increase the scope of Weka enormously!

The Application lesson 3.6 shows you how to use Weka to analyze functional MRI neuroimaging data, and in the Activity you will actually do some of this analysis!

cheers, and keep going!

Monday, 2 May 2016

Class 2 is now available

The six lessons for Class 2 are now available on the course website:


The mid-course assessment, which covers the material up to and including Class 2, is also available. Following that, there are 3 weeks to go (classes 3, 4 and 5). The mid-course assessment will remain open for the rest of the course; the final assessment will appear during week 5.

The activities are a crucial part of the course: they're where most people will do their actual learning! However, they do not form part of the assessment, so don't be scared to get wrong answers. Also, some of the activities are pretty difficult and time-consuming. You don't necessarily need to complete them if that proves difficult on your computer, but you do need to understand what you are supposed to do -- and why.

"Advanced Data Mining with Weka" has been designed so that participants at many different levels can learn as much as possible -- and complete the course successfully. All you must do to get the Statement of Completion is complete the mid-course and final assessments -- which you can try as often as you like.

This class is about data stream mining, and MOA, Weka's big sister. MOA's algorithms are stream-oriented: they don't keep the dataset in main memory. You can access the algorithms from the Weka interface. But an important aspect of stream-oriented data mining is evaluation: how do you evaluate a learning algorithm that runs continuously on a data stream (which may, in addition, be evolving)? That is what the MOA interface is for, and you will learn about that too.
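One common answer to that evaluation question is prequential (interleaved test-then-train) evaluation: each arriving instance is first used to test the current model and only then used to update it, so no separate test set is needed and the dataset never has to sit in memory. This post does not say which scheme the lessons use, so treat the following as a rough illustration in plain Python rather than MOA's own API. The `MajorityClassLearner` and `synthetic_stream` names are made up for this sketch.

```python
from collections import Counter
import random


class MajorityClassLearner:
    """Hypothetical baseline: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, features):
        if not self.counts:
            return None  # nothing learned yet, so no prediction
        return self.counts.most_common(1)[0][0]

    def learn(self, features, label):
        self.counts[label] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: score each instance before the learner sees its label."""
    correct = 0
    total = 0
    for features, label in stream:
        if learner.predict(features) == label:  # test first
            correct += 1
        learner.learn(features, label)          # then train
        total += 1
    return correct / total if total else 0.0


def synthetic_stream(n, seed=1):
    """Hypothetical stream: yields one (features, label) pair at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.random()
        yield (x,), ("pos" if x < 0.7 else "neg")


if __name__ == "__main__":
    acc = prequential_accuracy(synthetic_stream(1000), MajorityClassLearner())
    print(f"prequential accuracy: {acc:.3f}")
```

Because the stream is consumed one instance at a time from a generator, memory use stays constant however long it runs. For an evolving stream, a variant would compute the accuracy over a sliding window of recent instances instead of over the whole history.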

The Application in Lesson 2.6 is about applying Weka to a problem in bioinformatics, which is a very popular -- and important! -- area for data mining.

cheers, and keep going!