Tuesday, 24 May 2016

Class 5 is now available

The lessons for Class 5, the last in our course, are now available on the course website:


Class 5 is about scripting. People who use Weka a lot always want to be able to write scripts to do the work, instead of all this clicking. Well, you can! You can

  • write Weka scripts in Python from within the Explorer interface
  • write Weka scripts in Groovy, a Java-based scripting language, in the same way
  • set up the Python Weka wrapper so that you can access the Weka code from within your own Python installation. 
The third option requires you to have your own Python installation, which is simple on Linux but not so simple on other operating systems. That part is up to you -- we don't show you how to do it. But the other two options are immediately available from within Weka when you install the corresponding packages. It's easy!

The post-course assessment is also now open. The videos, slides and transcripts will remain available on YouTube and the "Materials" site:

There is also a post-course survey for your opinions of the MOOC.

We will run all three of these courses, “Data Mining with Weka,” “More Data Mining with Weka,” and “Advanced Data Mining with Weka,” again, but are not yet sure when.

cheers, and enjoy the remainder of the course!



Monday, 16 May 2016

Class 4 is now available

The six lessons for Class 4 are now available on the course website:

In this class we'll learn about distributed processing with Apache Spark (and also with Hadoop). We're assuming that you don't necessarily have access to a cluster computer, but you can still use the framework on a single machine, and the Activities will show you how to get started with this.

The Application lesson 4.6 shows you how to use Weka for image processing, by creating all sorts of different feature sets for your images using the imageFilter package.
Next week is the last. Pretty soon you will be a certified advanced expert in data mining and the use of Weka! Keep at it!


Monday, 9 May 2016

Class 3 is now available

The six lessons for Class 3 are now available on the course website:


After this week there are 2 weeks to go (classes 4 and 5).

The mid-course assessment is also now available. Do it when you have finished Class 2 (although it will remain open for the rest of the course). The final assessment will appear during week 5.

Check your Profile to ensure that your assessment marks have been recorded correctly. Also, check that the name in your Profile is the one you want on your Statement of Completion, as we will use that exact text for the Statements.

Our goal is to enable you to learn as much as possible from this course, and we recognize that doing the assessments may not be a priority for you. However, our ability to mount follow-up MOOCs will depend on the success of this one as perceived by my University -- and the number of people who complete it successfully will be a key metric. Thus I urge you to do the assessments for my sake, if not your own :-)

In this class we'll learn about interfacing to other data mining packages. The first lesson shows you how to access LibSVM and LibLINEAR, and the remaining ones show you how to access some of the many facilities in the popular R statistical computing package. These increase the scope of Weka enormously!

The Application lesson 3.6 shows you how to use Weka to analyze functional MRI neuroimaging data, and in the Activity you will actually do some of this analysis!

cheers, and keep going!

Monday, 2 May 2016

Class 2 is now available

The six lessons for Class 2 are now available on the course website:


The mid-course assessment, which covers the material up to and including Class 2, is also available. Following that, there are 3 weeks to go (classes 3, 4 and 5).
The mid-course assessment will remain open for the rest of the course; the final assessment will appear during week 5.
The activities are a crucial part of the course: they're where most people will do their actual learning! However, they do not form part of the assessment, so don't be scared to get wrong answers. Also, some of the activities are pretty difficult and time-consuming. You don't necessarily need to actually complete them if you find that difficult on your computer, but you do need to understand what it is that you are supposed to do -- and why.

"Advanced Data Mining with Weka" has been designed so that participants at many different levels can learn as much as possible – and complete the course successfully. All you must do to get the Statement of Completion are the mid-course and final assessments -- which you can try as often as you like. 

This class is about data stream mining, and MOA, Weka's big sister. MOA's algorithms are stream-oriented: they don't keep the dataset in main memory. You can access the algorithms from the Weka interface. But an important aspect of stream-oriented data mining is evaluation: how do you evaluate a learning algorithm that runs continuously on a data stream (which may, in addition, be evolving)? That is what the MOA interface is for, and you will learn about that too.
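
One common answer to the evaluation question is "test-then-train" (often called prequential) evaluation: each instance that arrives is first used to test the current model and only then used to train it, so the model is never scored on data it has already seen. The sketch below illustrates the idea in plain Python. It is not MOA's or Weka's code: the learner, the function name and the tiny stream are all hypothetical and exist only for illustration.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner (hypothetical): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any training there is nothing to predict.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        # Update the label counts with one instance; nothing else is stored.
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: every instance is used to test first, then to train."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        learner.learn(x, y)
        total += 1
    return correct / total if total else 0.0


# Made-up stream of (features, label) pairs, for illustration only.
stream = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "a")]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
print(accuracy)
```

Because the learner only updates running counts, the stream never has to be held in memory, which is the property the stream-oriented algorithms above rely on. On an evolving stream you would typically track accuracy over a sliding window rather than over the whole history, so that old mistakes do not mask recent behaviour.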

The Application in Lesson 2.6 is about applying Weka to a problem in bioinformatics, which is a very popular -- and important! -- area for data mining.
cheers, and keep going!

Monday, 25 April 2016

Welcome to "Advanced Data Mining with Weka"

Welcome to the course "Advanced Data Mining with Weka". The six lessons for Class 1 are now available on the course website:

We will release classes 2, 3, 4, and 5 on Mondays (NZ time) in the upcoming weeks, and send reminder announcements.
Weka 3.8 has just been released and you will be using it throughout this course, so please download it and install it on your computer. It’s available at both:



This course includes the following resources:
  • videos, one per lesson, on YouTube
  • the videos include captions, which can be turned on in YouTube
  • we recommend viewing in HD format, again a YouTube control
  • slides used in the videos (PDF format)
  • text files containing transcripts of the videos
  • activities that follow each lesson
  • mid-course assessment (opens 2 May, with the Week 2 content)
  • final assessment (opens 23 May, with the Week 5 content)
  • announcement forum, blog, Twitter feed (available from the course website)
  • discussion forum.

Some notes:
  • work through the videos and activities at your own pace, in your own time
  • a new class appears every week; old classes will remain available until the course closes
  • in theory, you could leave all your learning to the last week (but we don't recommend this!)
  • please subscribe to the announcement forum if you haven't already done so: this is the best way to stay up-to-date with the course (click on Membership and email settings to subscribe)
  • only the mid-course and final assessments count towards the Statement of Completion
  • please check your name and marks in the My Profile section of the website (this is the data we will use to produce your Statement of Completion)
  • during the videos, it may help to follow with Weka on your own computer
  • the course should take 3–6 hours/week
  • a detailed syllabus is available:

Please help us by filling out the pre-course survey if you have not already done so.

By the time you have finished this course you will be an advanced expert user of Weka and very knowledgeable about data mining generally. But it will take some effort and motivation.

cheers, and good luck

Wednesday, 6 April 2016

"Advanced Data Mining with Weka" open for enrolment

Advanced Data Mining with Weka is now open for enrolment, and is scheduled to start on 25 April.

Like the other two Weka MOOCs, this draws on the resources of the Machine Learning Group in the Department of Computer Science at the University of Waikato. It covers:

  • time series forecasting; data stream mining
  • inter-operability with R; scripting Weka in Python and Groovy
  • distributed processing with Apache Spark and Hadoop
  • application case studies
A detailed syllabus is available.

This is advanced stuff, and you need to be an experienced Weka user before starting. The format is the same as for the earlier courses, and again you will do most of your learning in the Activities, although whether you get a Statement of Completion depends solely on how well you do in the mid-class and end-of-class assessments.

There’s more information about the course in the trailer video: it’s informative, entertaining, and only about 4 minutes long.

By the time you have finished this course you will be an advanced expert on the use of Weka. Enrol at:


Ian & the Weka Team

Thursday, 3 March 2016

Two Self-paced Courses

Both "Data Mining with Weka" and "More Data Mining with Weka" are now available on a self-paced basis. All the material, activities and assessments are available now until 15th April 2016 at:

We are not providing any tutorial, help or assistance during this session. Also, we will not generate any Statements of Completion until after 15th April.

Ian & the WekaMOOC team